ERNST ZERMELO 



Collected Works 
Gesammelte Werke 

VOLUME II 
BAND II 

Calculus of Variations, 
Applied Mathematics, 
and Physics 

Variationsrechnung, 
Angewandte Mathematik 
und Physik 




HEIDELBERGER AKADEMIE 
DER WISSENSCHAFTEN 




Springer 




Schriften der Mathematisch-naturwissenschaftlichen Klasse 
der Heidelberger Akademie der Wissenschaften 
Nr. 23 (2013) 







VOLUME I 
BAND I 

Set Theory, 
Miscellanea 

Mengenlehre,
Varia



VOLUME II 
BAND II 

Calculus of Variations, 
Applied Mathematics, 
and Physics 

Variationsrechnung, 
Angewandte Mathematik 
und Physik 







Edited by 

Herausgegeben von 

Heinz-Dieter Ebbinghaus, 
Akihiro Kanamori 




Springer 




Editors 

Herausgeber 



Heinz-Dieter Ebbinghaus 
University of Freiburg, Germany 

Akihiro Kanamori 
Boston University, MA, USA 



ISBN 978-3-540-70855-1 e-ISBN 978-3-540-70856-8 

DOI 10.1007/978-3-540-70856-8 
Springer Heidelberg Dordrecht London New York 

Library of Congress Control Number: 2007921876 

Mathematics Subject Classification (2010): 01A55, 01A60, 01A75, 49-02, 49K05, 49K15, 49Q05, 
49S05, 53C22, 62J15, 70B99, 76B47, 76M23, 76R50, 80A05, 82C03, 82C40 

© Springer-Verlag Berlin Heidelberg 2013 

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, 
broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of 
this publication or parts thereof is permitted only under the provisions of the German Copyright Law of 
September 9, 1965, in its current version, and permission for use must always be obtained from 
Springer. Violations are liable to prosecution under the German Copyright Law. 

The use of general descriptive names, registered names, trademarks, etc. in this publication does not 
imply, even in the absence of a specific statement, that such names are exempt from the relevant 
protective laws and regulations and therefore free for general use. 

Cover design: WMXDesign GmbH, Heidelberg 
Typesetting and production: le-tex publishing services oHG, Leipzig 

Printed on acid-free paper 

Springer is part of Springer Science+Business Media (www.springer.com) 




Preface to the Zermelo edition 



This is a complete edition of the published works of Ernst Zermelo which 
moreover includes selected correspondence and unpublished manuscripts. 
Zermelo is generally acknowledged for his pioneering work in axiomatic set 
theory and for introducing the axiom of choice as a basic principle of math- 
ematics. In contrast, his work in applied mathematics and physics, despite 
its originality, is hardly recognized or has even been attributed to others. 
This edition of Zermelo’s collected works provides a picture of the entire 
mathematician. It appears in two volumes. The first volume comprises Zer- 
melo’s published papers in set theory and the foundation of mathematics 
together with isolated papers of an algebraic, analytic, or number-theoretic 
character. The second volume is dedicated to Zermelo’s work in the calcu- 
lus of variations, mathematical physics, and fluid dynamics. Both volumes 
are supplemented by selected notes and manuscripts, mainly from Zermelo’s 
Nachlass, which throw additional light on his papers, reflect his point of view, 
or are unpublished continuations of published work. To the best judgment of 
the editors, the selected notes and manuscripts fully and faithfully represent 
the essential unpublished writings of Zermelo concerning mathematics. Nev- 
ertheless, a possible edition of a third volume comprising further unpublished 
notes and letters from the Nachlass has expressis verbis been left open. 

Both volumes contain some writings by other authors which include con- 
tributions actually written by Zermelo or which react to criticism Zermelo 
had made. Details are given in the prefaces to the respective volumes. 

In order to provide access to a wider audience, the original papers are 
printed face to face with English translations. As both versions use the same 
layout, it is easy to go from the translation to the original version and vice 
versa. The layout itself tries to preserve the appearance of the original papers. 
For details we refer to the editorial information below. 

Each paper or coherent group of papers is preceded by an introductory 
note which comments on contents, motivation, aims, and influence of the 
paper(s) concerned. Written by an expert in the field, it came to its final 
form in discussions with the editors. 

Each volume contains a full bibliography of Zermelo together with a 
schematic curriculum vitae which will enable the reader to become acquainted 
with the personal circumstances from which a paper arose. In addition, Vol- 
ume I starts with a more detailed biographical sketch of Zermelo’s life and 
work. 

Many of these features found their inspiration in the exemplary edition 
of Kurt Gödel’s collected works by Solomon Feferman, John W. Dawson, Jr., 
and others. 

The edition of Zermelo’s collected works has a prehistory. Already as early as 
1912, at the age of 41 and faced with a serious recurrence of his tuberculosis, 





Zermelo conceived plans for an edition of his collected papers, but did not 
pursue them when his health improved. In 1949, under likewise deplorable 
personal circumstances, he tried again, this time approaching several pub- 
lishers, among them Springer-Verlag. But the difficult situation in post-war 
Germany precluded such an enterprise. Immediately after Zermelo’s death, 
in 1953, the historian of mathematics Helmuth Gericke and the philosopher 
Gottfried Martin, who had gotten to know Zermelo in the 1930s in Freiburg, 
started work on a two-volume edition, in 1956 gaining Paul Bernays as a 
third editor. Support was provided by the Kant-Gesellschaft. However, the 
plans were not realized; in 1962 work on the edition came to a definite end. 

When in early 2004 new plans for an edition of Zermelo’s collected works 
became more concrete, they found the enthusiastic support of Martin Pe- 
ters of Springer-Verlag. In discussions with him it became clear very quickly 
that the edition should provide English translations and detailed comments. 
As Zermelo had been a member of the Heidelberger Akademie der Wis- 
senschaften, the editors turned to the academy for financial support. The 
application found warm backing of Hans Günter Dosch, then Sekretär of the 
class for mathematics and the sciences of the academy. The application was 
successful. Even more, besides providing generous funding, the academy of- 
fered to let the edition appear in its regular series of publications of the class 
for mathematics and the sciences published by Springer-Verlag. 

The editors wish to express deep gratitude to the Heidelberg academy for 
their ideal, financial support and to Springer-Verlag for their open-minded 
cooperation. In particular, many thanks go to Hans Günter Dosch and Martin 
Peters. 



Freiburg, Toronto, and Boston 
September 2009 



Heinz-Dieter Ebbinghaus 
Craig G. Fraser 
Akihiro Kanamori 




Preface to volume II 



This second volume concludes the edition of Ernst Zermelo’s collected works. 
The volume focuses on his contributions mainly to analysis and physics. Ex- 
cept for an excursion into physical chemistry (Riesenfeld and Zermelo 1909), 
the papers come from the decade around 1900 when Zermelo was in Berlin and 
Göttingen and about two years around 1930 when he was in Freiburg. They 
are accompanied by three items found in Zermelo’s and in David Hilbert’s 
Nachlass. For orientation especially about the personal circumstances accom- 
panying the genesis of the papers, the volume starts with Zermelo’s curricu- 
lum vitae, the one given in volume I. 

Zermelo’s works of an applied character may hold pioneering ideas and 
insights, but they did not receive the attention they deserved. One reason 
may be the sheer diversity of topics he treated. Of course, one should also 
take into consideration that starting soon after the turn of the century his 
mathematical work shifted elsewhere for more than two decades, to set theory 
and mathematical logic, research in these disciplines leading him to his most 
influential scientific achievements. 

The Berlin-Göttingen period comprises three topics: the calculus of vari- 
ations, the kinetic theory of gases, and hydrodynamics. 

The engagement with the calculus of variations started with Zermelo’s 
Ph. D. thesis (1894), written at the University of Berlin under the guidance 
of Hermann Amandus Schwarz. 

The engagement in the kinetic theory of gases started in 1896, also in 
Berlin, when Zermelo became an assistant to Max Planck. It lasted for about 
ten years. Its best-known part, a controversy with Ludwig Boltzmann, is 
described and analyzed here in full with the inclusion of Boltzmann 1896, 
1897. 

Zermelo’s interest in meteorology led him to hydrodynamics, work that 
culminated in his 1899 Habilitation thesis (1902a, s1902b, s1902c) in Göt- 
tingen. 

In the late 1920s, Zermelo came back to his “old, even though mostly 
unhappy love for the ‘applications’”. The starting paper (1928) on the eval- 
uation of chess tournaments, with its early use of the maximum likelihood 
method, was to remain unknown until several decades later other people re- 
discovered his methods and results. Motivated by the circumnavigation of 
the earth by the airship Graf Zeppelin in August 1929, Zermelo wrote two 
papers (1930c, 1931a) on optimal steering methods of airships. Soon, how- 
ever, this return to mathematics of an applied character came to an end when 
Zermelo got involved in a serious foundational debate which fully occupied 
what strength was left him after a serious illness. 

The introductory notes are a crucial part of the Zermelo edition. Those who 
agreed to comment on a paper or a group of papers in this volume generously 





shared their experience and knowledge with us and the potential reader. 
We at times had involved discussions toward securing the most informative 
and accurate presentations, and we appreciate the professionalism that was 
brought to bear. 

The translations of the original papers were carried out by Enzo de Pel- 
legrin. We again admire his extraordinary care and his feeling for both lan- 
guages when handling Zermelo’s style with its richness in nuances and its 
involved sentential structures. The introductory notes of Rüdiger Thiele were 
translated by David Kramer who with diligence and care successfully mir- 
rored the style of the original German. 

We express our gratitude to all who have supported us during our work. 
In this connection we would like to mention Ruth Allewelt from Springer- 
Verlag, Andrea Köhler and Petra Möws of le-tex publishing services, and 
Marlies Wurth, the librarian of the Freiburg Mathematical Institute. 

Again, Martin Peters of Springer-Verlag was ready to offer valuable help 
and advice. 

We appreciate that Craig Fraser, while not being able to continue with 
his participation in the edition, was ready to contribute two substantial in- 
troductory notes. 

Freiburg and Boston 
December 2012 



Heinz-Dieter Ebbinghaus 
Akihiro Kanamori 




Editorial information 



Layout. The layout of the texts as well as of the translations mirrors the lay- 
out of the originals. Emphasized words, i.e., words in italics or words spaced 
out or consisting of small capitals, are given in italics. Original pagebreaks 
are indicated in the texts by and the number of the new original page 
beginning there is given on the margin. 

Editorial annotations. These are set in double square brackets “⟦” and “⟧”. 

Misprints and errors. Small textual errors in the originals are tacitly cor- 
rected; larger ones are corrected with the corrections commented on in edi- 
torial annotations. 

Wrong words or words missing in the originals have been replaced or 
added in double square brackets. 

Misprints in mathematical expressions in the originals are not corrected 
in the texts. They are, however, corrected in the translations and noted by 
an editorial annotation. 

References. In the texts Zermelo’s references to the literature are not altered. 
Translations as well as introductory notes refer to the main bibliography at 
the end of the volume instead and have the form author(s) year of appearance, 
followed by an additional index a, b, c, . . . if necessary. An example: Hahn 
and Zermelo 1904. If the authors are clear from the context, their names may 
be omitted; in such a case, 1904 may be short for Hahn and Zermelo 1904. 
References to page numbers are kept in both the texts and the translations; 
they can be traced via the original pagebreaks and the original page numbers 
provided in the texts. 

Footnotes. Whereas the translations use natural numbers in ascending order 
as footnote marks, the texts preserve the original marks. It may thus happen 
that a page of the text may contain identical footnote marks. In such cases 
the original page numbers on the margin allow for quick correlation of mark 
and footnote. 

Figures. Whenever possible, a figure is located at the same position as in 
the original. If this is not possible for a figure, say Fig. n, then its original 
position is indicated on the margin by “Fig. n” and the figure itself appears 
as close as possible, at worst on the top of the next page. 




Copyright permissions 



The editors express their gratitude to the following copyright holders for 
granting permission for the inclusion of the original texts and their English 
translations: 

Universitätsbibliothek der Humboldt-Universität Berlin (1894); 

Wiley-VCH (1896a, Boltzmann 1896, 1896b, Boltzmann 1897, 1931a, 1933a); 

Akademie der Wissenschaften zu Göttingen (1899a); 

Niedersächsische Staats- und Universitätsbibliothek Göttingen (s1899b); 

S. Hirzel Verlag (1900, Riesenfeld and Zermelo 1909); 

B. G. Teubner (1902a, 1902d, Hahn and Zermelo 1904, 1906, 1930c); 

Universitätsarchiv Freiburg (s1902b, s1902c); 

Springer Verlag (1904a, 1928). 

Further thanks go to the archive of the Universität Zürich for granting 
permission for the inclusion of the title photo (Universitätsarchiv Zürich, 
sign. (UAZ) AB. 1.1165). 




Editors and contributors of introductory notes 



Hans-Georg Bartel 
Institut für Chemie 
Humboldt-Universität zu Berlin 
12489 Berlin 
Germany 

hg.bartel@yahoo.de 

Alexey V. Borisov 
Institute of Computer Science 
Udmurt State University 
Izhevsk 426034 
Russia 

borisov@rcd.ru 

Rainer Brüggemann 
Department of Ecohydrology 
Leibniz-Institute of Freshwater Ecology 
and Inland Fisheries 
Müggelseedamm 310 
12587 Berlin 
Germany 

brg_home@web.de 

Heinz-Dieter Ebbinghaus 
Mathematisches Institut 
Universität Freiburg 
79104 Freiburg 
Germany 

lide@uni-freiburg.de 

Craig G. Fraser 
Institute for the History and Philosophy 
of Science and Technology 
Victoria College 
University of Toronto 
Toronto, Ontario M5S 1K7 
Canada 

cfraser@chass.utoronto.ca 

Larisa A. Gazizullina 
Institute of Computer Science 
Udmurt State University 
Izhevsk 426034 
Russia 

lag@rcd.ru 



Mark E. Glickman 
Center for Health Quality, Outcomes 
& Economics Research 
Edith Nourse Rogers Memorial Hospital 
200 Springs Road, mail drop 152 
Bedford, MA 01730 
USA 

mg@bu.edu 

Akihiro Kanamori 
Department of Mathematics 
Boston University 
Boston, MA 02215 
USA 

aki@math.bu.edu 

Sergei M. Ramodanov 
Institute of Computer Science 
Udmurt State University 
Izhevsk 426034 
Russia 

ramodanov@mail.ru 

Rüdiger Thiele 
Senefelderstraße 7 
06114 Halle 
Germany 

ruediger.thiele3@freenet.de 

Jos Uffink 
Department of Philosophy 
University of Minnesota 
Minneapolis, MN 55455 
USA 

jbuffink@umn.edu 




Contents of volume II 



Ernst Zermelo’s curriculum vitae, by Heinz-Dieter Ebbinghaus 

Zermelo 1894 
Introductory note to 1894, by Craig G. Fraser 
Untersuchungen zur Variations-Rechnung 
Investigations in the calculus of variations 

Zermelo 1896a 
Introductory note to 1896a, 1896b, and Boltzmann 1896, 1897, by Jos Uffink 
Ueber einen Satz der Dynamik und die mechanische Wärmetheorie 
On a theorem of dynamics and the mechanical heat theory 

Boltzmann 1896 
Introductory note: see under Zermelo 1896a 
Entgegnung auf die wärmetheoretischen Betrachtungen des Hrn. E. Zermelo 
Rejoinder to the heat-theoretic considerations of Mr. E. Zermelo 

Zermelo 1896b 
Introductory note: see under Zermelo 1896a 
Ueber mechanische Erklärungen irreversibler Vorgänge. Eine Antwort auf Hrn. Boltzmann’s „Entgegnung“ 
On mechanical explanations of irreversible processes. An answer to Mr. Boltzmann’s “rejoinder” 

Boltzmann 1897 
Introductory note: see under Zermelo 1896a 
Zu Hrn. Zermelo’s Abhandlung „Ueber die mechanische Erklärung irreversibler Vorgänge“ 
On Mr. Zermelo’s paper “On the mechanical explanation of irreversible processes” 

Zermelo 1899a 
Introductory note to 1899a, by Rüdiger Thiele 
Ueber die Bewegung eines Punktsystems bei Bedingungsungleichungen 
On the motion of a point system with constraint inequalities 

Zermelo s1899b 
Introductory note to s1899b, by Rüdiger Thiele 
Wie bewegt sich ein unausdehnbarer materieller Faden unter dem Einfluss von Kräften mit dem Potentiale W(x,y,z)? 
How does an inextensible material string move under the action of forces with potential W(x,y,z)? 

Zermelo 1900 
Introductory note to 1900, by Jos Uffink 
Über die Anwendung der Wahrscheinlichkeitsrechnung auf dynamische Systeme 
On the application of the calculus of probabilities to dynamical systems 

Zermelo 1902a 
Introductory note to 1902a, by Alexey V. Borisov, Larisa A. Gazizullina, and Sergei M. Ramodanov 
Hydrodynamische Untersuchungen über die Wirbelbewegungen in einer Kugelfläche (Erste Mitteilung) 
Hydrodynamical investigations of vortex motions in the surface of a sphere (First communication) 

Zermelo s1902b 
Introductory note: see under Zermelo 1902a 
Hydrodynamische Untersuchungen über die Wirbelbewegungen in einer Kugelfläche (Zweite Mitteilung) 
Hydrodynamical investigations of vortex motions in the surface of a sphere (Second communication) 

Zermelo s1902c 
Introductory note: see under Zermelo 1902a 
§5. Die absolute Bewegung 
§5. The absolute motion 

Zermelo 1902d 
Introductory note to 1902d, by Rüdiger Thiele 
Zur Theorie der kürzesten Linien 
On the theory of shortest lines 

Zermelo 1904a 
Introductory note to 1904a, by Rüdiger Thiele 
Über die Herleitung der Differentialgleichung bei Variationsproblemen 
On the derivation of the differential equation in variational problems 

Hahn and Zermelo 1904 
Introductory note to Hahn and Zermelo 1904, by Rüdiger Thiele 
Weiterentwicklung der Variationsrechnung in den letzten Jahren 
Further development of the calculus of variations in recent years 

Zermelo 1906 
Introductory note to 1906, by Jos Uffink 
Besprechung von Gibbs 1902 und Gibbs 1905 
Review of Gibbs 1902 and Gibbs 1905 

Riesenfeld and Zermelo 1909 
Introductory note to Riesenfeld and Zermelo 1909, by Hans-Georg Bartel and Rainer Brüggemann 
Die Einstellung der Grenzkonzentrationen an der Trennungsfläche zweier Lösungsmittel 
The settling of the boundary concentrations at the dividing surface of two solvents 

Zermelo 1928 
Introductory note to 1928, by Mark E. Glickman 
Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung 
The calculation of the results of a tournament as a maximum problem in the calculus of probabilities 

Zermelo 1930c 
Introductory note to 1930c and 1931a, by Craig G. Fraser 
Über die Navigation in der Luft als Problem der Variationsrechnung 
On navigation in the air as a problem in the calculus of variations 

Zermelo 1931a 
Introductory note: see under Zermelo 1930c 
Über das Navigationsproblem bei ruhender oder veränderlicher Windverteilung 
On the navigation problem for a calm or variable wind distribution 

Zermelo 1933a 
Introductory note to 1933a, by Heinz-Dieter Ebbinghaus 
Über die Bruchlinien zentrierter Ovale. Wie zerbricht ein Stück Zucker? 
On the lines of fracture of central ovals. How does a piece of sugar break up? 

Bibliography 

Index 




Contents of volume I 



Ernst Zermelo: A glance at his life and work, by Heinz-Dieter Ebbinghaus 

Ernst Zermelo’s curriculum vitae, by Heinz-Dieter Ebbinghaus 

Zermelo 1901 
Introductory note to 1901, by Oliver Deiser 
Ueber die Addition transfiniter Cardinalzahlen 
On the addition of transfinite cardinal numbers 

Zermelo 1904 
Introductory note to 1904 and 1908a, by Michael Hallett 
Beweis, daß jede Menge wohlgeordnet werden kann 
Proof that every set can be well-ordered 

Zermelo 1908a 
Introductory note: see under Zermelo 1904 
Neuer Beweis für die Möglichkeit einer Wohlordnung 
A new proof of the possibility of a well-ordering 

Zermelo 1908b 
Introductory note to 1908b, by Ulrich Felgner 
Untersuchungen über die Grundlagen der Mengenlehre I 
Investigations in the foundations of set theory I 

Zermelo 1909a 
Introductory note to 1909a and 1909b, by Charles D. Parsons 
Sur les ensembles finis et le principe de l’induction complète 
On finite sets and the principle of mathematical induction 

Zermelo 1909b 
Introductory note: see under Zermelo 1909a 
Ueber die Grundlagen der Arithmetik 
On the foundations of arithmetic 

Zermelo 1913 
Introductory note to 1913 and D. König 1927b, by Paul B. Larson 
Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels 
On an application of set theory to the theory of the game of chess 

Zermelo 1914 
Introductory note to 1914, by Ulrich Felgner 
Über ganze transzendente Zahlen 
On integral transcendental numbers 

Landau 1917b 
Introductory note to Landau 1917b, by Heinz-Dieter Ebbinghaus 
Abschnitt 3 
Section 3 

Zermelo s1921 
Introductory note to s1921, by R. Gregory Taylor 
Thesen über das Unendliche in der Mathematik 
Theses concerning the infinite in mathematics 

Zermelo 1927 
Introductory note to 1927, by Jürgen Elstrodt 
Über das Maß und die Diskrepanz von Punktmengen 
On the measure and the discrepancy of point sets 

D. König 1927b 
Introductory note: see under Zermelo 1913 
Zusatz zu §5 
Addition to §5 

Zermelo 1929a 
Introductory note to 1929a, by Heinz-Dieter Ebbinghaus 
Über den Begriff der Definitheit in der Axiomatik 
On the concept of definiteness in axiomatics 

Zermelo s1929b 
Introductory note to s1929b and 1930b, by Heinz-Dieter Ebbinghaus 
Vortrags-Themata für Warschau 1929 
Lecture topics for Warsaw 1929 

Zermelo 1930a 
Introductory note to 1930a, by Akihiro Kanamori 
Über Grenzzahlen und Mengenbereiche. Neue Untersuchungen über die Grundlagen der Mengenlehre 
On boundary numbers and domains of sets. New investigations in the foundations of set theory 

Zermelo 1930b 
Introductory note: see under Zermelo s1929b 
Über die logische Form der mathematischen Theorien 
On the logical form of mathematical theories 

Zermelo s1930d 
Introductory note to s1930d, by Akihiro Kanamori 
Bericht an die Notgemeinschaft der Deutschen Wissenschaft über meine Forschungen betreffend die Grundlagen der Mathematik 
Report to the Emergency Association of German Science about my research concerning the foundations of mathematics 

Zermelo s1930e 
Introductory note to s1930e, by Akihiro Kanamori 
Über das mengentheoretische Modell 
On the set-theoretic model 

Zermelo 1930f 
Introductory note to 1930f, by Albert Henrichs 
Aus Homers Odyssee 
From Homer’s Odyssey 
Appendix: Zermelo’s translation of Book V of the Odyssey 

Zermelo s1931b 
Introductory note to s1931b, s1931c, Gödel 1931b, and s1931d, by Heinz-Dieter Ebbinghaus 
Brief an Kurt Gödel vom 21. September 1931 
Letter to Kurt Gödel of 21 September 1931 

Zermelo s1931c 
Introductory note: see under Zermelo s1931b 
Brief an Reinhold Baer vom 7. October 1931 
Letter to Reinhold Baer of 7 October 1931 

Gödel 1931b 
Introductory note: see under Zermelo s1931b 
Brief an Ernst Zermelo vom 12. October 1931 
Letter to Ernst Zermelo of 12 October 1931 

Zermelo s1931d 
Introductory note: see under Zermelo s1931b 
Brief an Kurt Gödel vom 29. Oktober 1931 
Letter to Kurt Gödel of 29 October 1931 

Zermelo s1931e 
Introductory note to s1931e and s1933b, by Akihiro Kanamori 
Sieben Noten über Ordinalzahlen und große Kardinalzahlen 
Seven notes on ordinal numbers and large cardinals 

Zermelo s1931f 
Introductory note to s1931f and s1932d, by Heinz-Dieter Ebbinghaus 
Sätze über geschlossene Bereiche 
Theorems on closed domains 

Zermelo s1931g 
Introductory note to s1931g, by R. Gregory Taylor 
Allgemeine Theorie der mathematischen Systeme 
General theory of mathematical systems 

Zermelo 1932a 
Introductory note to 1932a, 1932b, and 1935, by R. Gregory Taylor 
Über Stufen der Quantifikation und die Logik des Unendlichen 
On levels of quantification and the logic of the infinite 

Zermelo 1932b 
Introductory note: see under Zermelo 1932a 
Über mathematische Systeme und die Logik des Unendlichen 
On mathematical systems and the logic of the infinite 

Zermelo 1932c 
Introductory note to 1932c, by Heinz-Dieter Ebbinghaus 
Vorwort zu Cantor 1932 
Preface to Cantor 1932 

Zermelo s1932d 
Introductory note: see under Zermelo s1931f 
Mengenlehre 1932 
Set theory 1932 

Zermelo s1933b 
Introductory note: see under Zermelo s1931e 
Die unbegrenzte Zahlenreihe und die exorbitanten Zahlen 
The unlimited number series and the exorbitant numbers 

Zermelo 1934 
Introductory note to 1934, by Dieter Wolke 
Elementare Betrachtungen zur Theorie der Primzahlen 
Elementary considerations concerning the theory of prime numbers 

Zermelo 1935 
Introductory note: see under Zermelo 1932a 
Grundlagen einer allgemeinen Theorie der mathematischen Satzsysteme (Erste Mitteilung) 
Foundations of a general theory of the mathematical propositional systems (First notice) 

Zermelo s1937 
Introductory note to s1937, by Dirk van Dalen 
Der Relativismus in der Mengenlehre und der sogenannte Skolemsche Satz 
Relativism in set theory and the so-called Skolem theorem 

Zermelo s1941 
Introductory note to s1941, by Heinz-Dieter Ebbinghaus 
Brief an Paul Bernays vom 1. October 1941 
Letter to Paul Bernays of 1 October 1941 

Bibliography 

Index 




Corrections to volume I 



pp. xxiii-xxiv: Some English titles have been slightly changed. 

p. xxiv, line 12: Replace “Zermelo 1903” by “Zermelo 1904a”. 

p. 239, footnote 4: Replace “art. 64” by “art. 66” and delete the part in 
brackets. 

p. 364, lines 21/22 and p. 365, lines 23/24: Instead of 

“indem sie sich nicht sowohl auf die einzelnen ‘definiten’ Sätze p als 
auf ihre Gesamtheit P bezieht” 

Zermelo means 

“indem sie sich nicht auf die einzelnen ‘definiten’ Sätze p, sondern 
vielmehr auf ihre Gesamtheit P bezieht”. 

Hence, instead of 

“since it does not refer both to the individual ‘definite’ propositions 
p and to their totality P” 

the translation should be changed to 

“since it does not refer to the individual ‘definite’ propositions p, but 
rather to their totality P”. 

p. 367, line 1: Delete “0”. 

p. 537, line -18: Replace “Husserl 1928” by “Husserl 1922”. 

p. 539, line -1: Replace “partial ground relation” by “partial justification 
relation”. 

p. 540, line 6: Add a right parenthesis after “Sect. 1”. 

p. 541, line -3: Replace “the opening paragraph of Sect. 1” by “the paragraph 
preceding Sect. 1”. 

p. 642, line 11: Replace “1903” by “1904a” and shift the newly named item 
1904a behind item 1904. 








Ernst Zermelo’s curriculum vitae 

Heinz-Dieter Ebbinghaus 



1871 

27 July: Zermelo is born in Berlin as the second child and the only son of the 
Gymnasialprofessor Theodor Zermelo and his wife Auguste née Zieger. 

1878 

3 June: Death of Zermelo’s mother. 

1880 

April: Zermelo enters the Luisenstädtisches Gymnasium in Berlin. 

1889 

24 January: Death of Zermelo’s father. 

1 March: Zermelo finishes school. Remarks in his leaving certificate show that 
he suffers from physical fatigue. 

Summer semester - summer semester 1890: Zermelo studies mathematics and 
physics at the University of Berlin with, among others, Lazarus Fuchs and 
Johannes Knoblauch. 

1890 

Winter semester 1890/91: Zermelo studies at the University of Halle-Witten- 
berg with, among others, Georg Cantor and Edmund Husserl. 

1891 

Summer semester 1891: Zermelo studies at the University of Freiburg, with 
his subjects including, as in Berlin and Halle, philosophy and psychology. 

Winter semester 1891/92 - summer semester 1897: Zermelo studies again 
at the University of Berlin with, among others, Ferdinand Frobenius, Max 
Planck, Hermann Amandus Schwarz, and Wilhelm Wien. 

1894 

23 March: Zermelo applies to begin the Ph. D. process. 

6 October: Zermelo obtains his Ph. D. degree. His dissertation Untersuchun- 
gen zur Variations-Rechnung was supervised by Hermann Amandus Schwarz. 

December - September 1897: Zermelo is an assistant to Max Planck at the 
Institute for Theoretical Physics of the University of Berlin. 

1895 

December: Zermelo completes his first paper, 1896a, which sets out his op- 
position to Ludwig Boltzmann’s statistical theory of heat. 

1896 

Summer: Zermelo applies for an assistantship at the Deutsche Seewarte in 
Hamburg, but then decides to pursue an academic career. 

H.-D. Ebbinghaus, A. Kanamori (Eds.), Ernst Zermelo - Collected Works/ 
Gesammelte Werke II, DOI 10.1007/978-3-540-70856-8_1, Schriften der 
Mathematisch-naturwissenschaftlichen Klasse der Heidelberger Akademie 
der Wissenschaften 23, © Springer-Verlag Berlin Heidelberg 2013 







15 September: Zermelo completes his second paper, 1896b, opposing Boltz- 
mann. 

1897 

German translation Glazebrook 1897 of Glazebrook 1894. 

2 February: Zermelo passes the state exam for Gymnasiallehrer (high school 
teachers) that allowed him to teach mathematics and physics as main subjects 
and chemistry, geography, and mineralogy as additional subjects. According 
to the reports Zermelo exhibited a broad knowledge in German literature, 
philosophy, and religion. 

19 July: Zermelo asks Felix Klein in Göttingen for support for his Habilitation. 

Winter semester 1897/98: Zermelo continues his studies at Göttingen with, 
among others, David Hilbert, Felix Klein, and Arthur Schoenflies. 

1899 

3 February: David Hilbert presents Zermelo’s first paper in applied
mathematics, 1899a, to the Königliche Gesellschaft der Wissenschaften zu
Göttingen; it treats differential equations with inequalities.

Zermelo initiates the Habilitation process with the Habilitation thesis
“Hydrodynamische Untersuchungen über die Wirbelbewegungen in einer
Kugelfläche”, the first part of which is published as 1902a. The second part
(1902b, 1902c) remained unpublished; it contains a solution of the 3-vortex
problem on the sphere.

4 March: Zermelo gives his Habilitation address, 1900, which proposes an
alternative probabilistic approach to Boltzmann’s in the latter’s work in
statistical mechanics. He is granted the venia legendi for mathematics at the
University of Göttingen.

Around 1900 

Beginning of the cooperation with Hilbert on the foundations of mathematics. 
Zermelo formulates the Zermelo-Russell paradox. 

1900 

Winter semester 1900/01: Zermelo gives his first course on set theory, the 
main topic being the Cantorian theory of cardinals. 

1901 

9 March: David Hilbert presents Zermelo’s result on the addition of cardinals,
1901, to the Königliche Gesellschaft der Wissenschaften zu Göttingen. The
proof uses the axiom of choice.

1902 

12 May: Zermelo gives a talk on Frege’s foundation of arithmetic before the
Göttingen Mathematical Society.

Summer semester - winter semester 1906/07: Zermelo receives a Privatdozenten
grant.







Publication of 1902d, the first paper on the calculus of variations after the 
Ph.D. dissertation. It treats shortest lines of bounded steepness with or with- 
out bounded torsion. 

1903 

June: Zermelo is under consideration for an extraordinary professorship of 
mathematics at the University of Breslau. He is shortlisted in the second 
position after Gerhard Kowalewski, Franz London, and Josef Wellstein who 
are shortlisted aequo loco in the first position. 

1 December: Zermelo completes his second paper on the calculus of variations, 
1904a. It gives two simple proofs of a result of Paul du Bois-Reymond on the 
range of the method of Lagrange. 

1904 

Beginning of a life-long friendship with Constantin Carathéodory.

Together with Hans Hahn, Zermelo writes a contribution on the calculus of
variations, Hahn and Zermelo 1904, for the Encyklopädie der mathematischen
Wissenschaften.

August: Third International Congress of Mathematicians at Heidelberg. Julius
König gives a flawed refutation of Cantor’s continuum hypothesis. The
error is detected by both Zermelo and Felix Hausdorff.

24 September: Zermelo informs Hilbert about his proof of the well-ordering
theorem and the essential role of the axiom of choice. The letter is published
as 1904.

15 November: During a meeting of the Göttingen Mathematical Society, Zermelo
defends his well-ordering proof against criticism by Julius König, Felix
Bernstein, and Arthur Schoenflies.

1905 

January: Zermelo falls seriously ill. In order to recover, he spends spring and 
early summer in Italy. 

German translation Gibbs 1905 of Gibbs 1902. 

Spring: Zermelo works on the theory of finite sets which finally results in 
1909a and 1909b. 

21 December: Zermelo receives the title “Professor”. The application had been 
filed by Hilbert in December 1904. 

1906 

Early that year: Zermelo catches pleurisy. 

Zermelo works on a book on the calculus of variations together with
Carathéodory.

Zermelo publishes a final criticism of Boltzmann’s statistical interpretation
of the second law of thermodynamics in the review 1906 of Gibbs 1902.

Summer semester: Zermelo lectures on “Mengenlehre und Zahlbegriff”. He
formulates an axiom system of set theory which comes close to the one
published by him in 1908.

June: Medical doctors diagnose tuberculosis of the lungs. 







Summer: Zermelo spends a longer time at the seaside. 

Autumn: Zermelo is under discussion for a full professorship of mathematics
at the University of Würzburg. The professorship is given to the extraordinary
Würzburg professor Georg Rost. According to Hermann Minkowski, Zermelo’s
difficulties in obtaining a professorship are rooted in his “nervous haste”.

Winter 1906/07 - winter 1907/08: Several extended stays in Swiss health
resorts for lung diseases.

1907 

March: Zermelo applies for a professorship at the Academy of Agriculture in 
Poppelsdorf without success. 

May: During a stay in Montreux Zermelo finishes his paper 1909a. 

14 July and 30 July: During a stay in the Swiss alps Zermelo completes his 
papers on a new proof of the well-ordering theorem and on the axiomatization 
of set theory, 1908a and 1908b, respectively. 

20 August: Following an application by the Göttingen Seminar of Mathematics
and Physics, the ministry commissions Zermelo to give lecture courses in
mathematical logic and related matters, thus installing the first official lec- 
tureship for mathematical logic in Germany. 

1908 

April: Fourth International Congress of Mathematicians in Rome. Zermelo 
presents his work on finite sets, 1909b. He becomes acquainted with Bertrand 
Russell. Together with Gerhard Hessenberg and Hugo Dingler he conceives
plans for establishing a quarterly journal for the foundations of mathemat- 
ics. The project fails because of diverging views between the group and the 
Teubner publishing house. 

Summer semester: Zermelo gives a course on mathematical logic in fulfilment 
of his lectureship for mathematical logic and related topics. 

1909 

July: Zermelo is under consideration for an extraordinary professorship of
mathematics at the University of Würzburg. He is shortlisted in the first
position. The professorship is given to Emil Hilb, shortlisted in the second
position.

September: Completion of Riesenfeld and Zermelo 1909. 

1910 

24 January: The board of directors of the Göttingen Seminar of Mathematics
and Physics applies to the minister to appoint Zermelo an extraordinary 
professor. 

21 January: Zermelo, being under consideration for a full professorship of 
mathematics at the University of Zurich, is shortlisted in the first position. 
24 February: The Regierungsrat of the Canton Zurich approves the choice of 
Zermelo. 

15 April: Zermelo is appointed a full professor at the University of Zurich for 
an initial period of six years. 







1911 

28 January: Zermelo applies for leave for the coming summer semester be- 
cause of a worsening of his tuberculosis. 

February and March: Together with a partner, Zermelo applies for several 
patents concerning, for example, a regulator for controlling the revolutions 
of a machine. 

Zermelo is awarded the interest from the Wolfskehl prize, Hilbert being
chairman of the Wolfskehl committee of the Gesellschaft der Wissenschaften zu
Göttingen.

Summer semester - winter semester 1911/12: Leave for a cure because of 
tuberculosis. 

1912 

January: Serious worsening of tuberculosis diagnosed. 

Beginning of the cooperation with Paul Bernays who completes his Habi- 
litation with Zermelo in 1913 and stays at the University of Zurich as an 
assistant to Zermelo and later as a Privatdozent until 1919. 

August: Fifth International Congress of Mathematicians in Cambridge. Fol- 
lowing an invitation by Bertrand Russell, Zermelo gives two talks, one on 
axiomatic and genetic methods in the foundation of mathematical disciplines 
and one on the game of chess. The second one results in the paper 1913 which 
may be considered the first paper in game theory. 

Faced with the seriousness of his illness, Zermelo conceives plans for an edition 
of his collected papers. 

1913 

Spring: Zermelo is considered for a full professorship in mathematics at the 
Technical University of Breslau. He is shortlisted in the first position. The 
professorship is given to Max Dehn, shortlisted in the second position together 
with Issai Schur. 

December: Zermelo completes his paper 1914 on subrings of whole transcen- 
dental numbers of the field of the real numbers and the complex numbers, 
respectively; it makes essential use of the axiom of choice. 

1914 

Early that year: Zermelo has regular discussions with Albert Einstein. 
March: Operation of the thorax by Ferdinand Sauerbruch, the pioneer of 
thorax surgery. 

Around 1915 

Zermelo develops a theory of ordinal numbers where the ordinals are defined 
as by John von Neumann in 1923. 

1915 

Spring: A new serious outbreak of tuberculosis forces Zermelo to take a one- 
year leave. 

July: Waldemar Alexandrow completes his Ph. D. thesis Alexandrow 1915.
It is the only thesis guided by Zermelo alone. Kurt Grelling’s thesis Grelling
1910, which extends Zermelo’s theory of finite sets, was officially supervised
by David Hilbert, but guided by Zermelo.

Autumn: Several surgical treatments of a tuberculosis of the vocal cords.

1916 

21 March: As his illness is expected to extend into the summer semester, 
Zermelo is urged to agree to retire. 

5 April: Zermelo agrees to retire. 

15 April: Zermelo retires from his professorship. 

31 October: Zermelo is awarded the annual Alfred Ackermann-Teubner prize 
of the University of Leipzig for the promotion of the mathematical sciences. 
Later prize winners include, for example, Emil Artin and Emmy Noether. 

1 November - February 1917: Zermelo stays in Göttingen.

7 November 1916: Zermelo presents his theory of ordinal numbers to the
Göttingen Mathematical Society.

1917 

March - October 1919: Zermelo stays in various health resorts in the Swiss
Alps.

1919 

July: First draft of the paper 1928 wherein Zermelo develops a procedure 
for evaluating the result of a tournament by using a maximum likelihood 
method. 

November - March 1921: Zermelo stays at Locarno, Switzerland. 

1921 

Spring: Zermelo stays in Southern Tyrol and has correspondence with Abra- 
ham Fraenkel. 

6 May 1921: Fraenkel informs Zermelo about a gap he has discovered in 
Zermelo’s 1908 axiom system of set theory. 

10 May: In his answer to Fraenkel, Zermelo proposes a second-order version 
of the axiom of replacement in order to close the gap, at the same time 
criticizing it because of its non-definite character. 

17 July (?): Zermelo formulates his “infinity theses” (s1921) where he
describes the aims of his research in infinitary languages and infinitary logic as
carried out in the early 1930s.

22 September: Fraenkel announces his axiom of replacement in a talk de- 
livered at the annual meeting of the Deutsche Mathematiker-Vereinigung. 
Zermelo agrees in principle, but maintains a critical attitude because of a 
deficiency of definiteness. 

1 October: Zermelo settles in Freiburg, Germany. 

1923 

Winter semester 1923/24: Zermelo attends Edmund Husserl’s course “Erste
Philosophie”.

- 1929: Cooperation with Marvin Farber on the development of a semantically 
based logic system, in 1927 leading to plans for a monograph on logic. 







1924 

Summer: Zermelo loses interest in Husserl’s phenomenology. Discussions with 
Marvin Farber on the possibility of obtaining a professorship in the USA. 

1926 

or earlier: Zermelo starts a translation of Homer’s Odyssey, one that aims at 
“liveliness as immediate as possible”. 

22 April: Zermelo is appointed an ordentlicher Honorarprofessor at the Math- 
ematical Institute of the University of Freiburg. 

Winter semester - winter semester 1934/35: Zermelo gives regular courses in 
various fields of mathematics. 

- 1932: Zermelo works on the edition of Cantor’s collected papers, Cantor 
1932. He is supported by the mathematicians Reinhold Baer and Arnold 
Scholz and the philosopher Oskar Becker. The participation of Abraham 
Fraenkel leads to a mutual estrangement. 

1927 

12 June: Zermelo completes his paper 1927 on measurability, where he
presents results which he had obtained around 1914 and which had first been 
presented in Alexandrow’s thesis Alexandrow 1915. 

1928 

3 August: Zermelo completes his paper 1928 on the evaluation of tourna- 
ments. 

1929 

- 1931: Zermelo receives a grant from the Notgemeinschaft der Deutschen 
Wissenschaft (Deutsche Forschungsgemeinschaft) for a project on the nature 
and the foundations of pure and applied mathematics and the significance of 
the infinite in mathematics. 

May and June: Zermelo spends several weeks in Poland, giving talks in Cra- 
cow and Lvov and a series of talks in Warsaw. In the latter he presents his 
view of the nature of mathematics, arguing strongly against intuitionism and 
formalism. 

11 July: Zermelo completes his paper 1929a wherein he responds to criticism 
of his notion of definiteness as put forward by Abraham Fraenkel, Thoralf 
Skolem, Hermann Weyl, and others. 

18 September: At the annual meeting of the Deutsche Mathematiker-Ver- 
einigung in Prague Zermelo gives a talk, 1930c, on the solution of what is 
now called the “Zermelo navigation problem”. An extension of the result is 
published as 1931a. 

Arnold Scholz becomes an assistant at the Mathematical Institute of the Uni- 
versity of Freiburg, staying there for five years. Until his death on 1 February 
1942, he will be Zermelo’s closest friend and scientific partner. 







1930 

13 April: Zermelo completes the paper 1930a wherein he formulates the 
second-order Zermelo-Fraenkel axiom system and presents an incisive pic- 
ture of the cumulative hierarchy. 

- 1932: Zermelo’s controversy with Skolem and Gödel about finitary
mathematics, in particular about Skolem’s first-order approach to set theory and
Gödel’s first incompleteness theorem.

1931 

Zermelo develops infinitary languages and an infinitary logic as a response to
Skolem and Gödel.

15 September: Zermelo presents his work on infinitary languages and infini- 
tary logic at the annual meeting of the Deutsche Mathematiker-Vereinigung. 
The talk results in the polemical paper 1932a and the straightforward 1932b.

September/October: Correspondence with Gödel about the proof of Gödel’s
first incompleteness theorem and Zermelo’s infinitistic point of view.

18 December: Zermelo is elected a corresponding member of the Gesellschaft
der Wissenschaften zu Göttingen on a proposal of Richard Courant.

- 1935: Zermelo continues his research on infinitary languages and infinitary 
logic which results in the paper 1935. He works on large cardinals and on a 
monograph on set theory. 

1932 

Spring: Zermelo goes on a cruise that visits ancient sites in Greece, Italy, and 
Northern Africa. 

June: Zermelo devises an electrodynamic clutch for motorcars. 

July: Zermelo is invited to contribute a paper to a special issue of
Zeitschrift für angewandte Mathematik und Mechanik in honor of its founder
and editor Richard von Mises. He accepts the invitation and contributes the
paper 1933a.

1933 

16 February: Zermelo is elected an extraordinary member of the Heidelberger 
Akademie der Wissenschaften on a proposal of Heinrich Liebmann and Artur 
Rosenthal. 

1934 

Zermelo moves to Bernshof, a remote country house in the hilly outskirts of 
Freiburg where he lives until his death. 

2 November: Zermelo presents his paper 1934 on elementary number theory
to the Göttingen academy.

1935 

2 March: Zermelo resigns his honorary professorship when denounced for his
unwillingness to give the Hitler salute. 

- 1940: Smaller scientific projects in various fields of mathematics, further 
work on a book on set theory and work on a collection of mathematical 
“miniatures” representing several of his own results. 







1937 

4 October: Zermelo gives a flawed refutation, s1937, of the existence of
countable models of set theory.

1941 

19 July: Arnold Scholz organizes a colloquium in Göttingen on the occasion
of Zermelo’s 70th birthday; Zermelo gives three talks that correspond to three 
items of his collection of mathematical “miniatures”. Other speakers include 
Konrad Knopp and Bartel van der Waerden. 

1944 

14 October: Marriage to Gertrud Seekamp. 

Zermelo suffers from a glaucoma that can no longer be treated adequately 
and will eventually lead to total blindness. 

1946 

23 January: Zermelo turns to the Rector’s office of the University of Freiburg 
to request his reappointment as a full honorary professor. 

23 July: Zermelo is reappointed an honorary professor. Because of age and 
increasing blindness, he is unable to lecture. 

1947 

- 1948: Zermelo stays several times in Switzerland. He can live there on his
pension, access to which is barred from Germany, leaving the Zermelos in a
difficult financial situation.

- 1950: In order to escape financial need, Zermelo tries to move back to 
Switzerland. His applications fail; the Swiss authorities argue that his Swiss 
pension does not suffice to provide for his wife as well. 

1949 

Spring: Zermelo tries to arrange an edition of his collected works. He fails. 

1953 

21 May: Zermelo, nearly 82 years old, dies at Bernshof in Freiburg. 

23 May: Zermelo is buried. 

- 1962: Helmuth Gericke and Gottfried Martin work on the edition of Zer- 
melo’s collected works. In 1956 Paul Bernays agrees to take part in the edi- 
tion. The project was not realized. 

The Zermelo Nachlass is acquired by the University of Freiburg. 

2003 

15 December: Gertrud Zermelo, 101 years old, dies at Bernshof in Freiburg. 




Introductory note to 1894 

Craig G. Fraser 



In 1894 Zermelo published his doctoral dissertation from the University of 
Berlin, written under the direction of Hermann Amandus Schwarz and de- 
voted to a study of Karl Weierstrass’s methods in the calculus of variations. 
In an introductory note Zermelo stated that he had become familiar with 
the contents of Weierstrass’s lectures in 1892 from the copy in the “Ma- 
thematischer Verein” in Berlin, as well as from a lecture given by Schwarz. 
Weierstrass had investigated the simplest case in which only the first deriva- 
tives of the variables appear in the variational integrand function. Zermelo’s 
main goal was to extend Weierstrass’s results on necessary and sufficient con- 
ditions involving the “excess” or E function (the material in the twentieth to 
the twenty-third lectures of Weierstrass’s lectures as they were eventually 
published (1927)) to variational problems in parametric form in which the
integrand contains derivatives of order higher than one. 

1. Sufficient conditions before Weierstrass 

A major goal of the calculus of variations in the nineteenth century was to 
identify conditions that ensure that a proposed solution to a given variational 
problem is a maximum or a minimum. Any such solution will have to satisfy 
the Euler differential equation and will also have to satisfy Legendre’s condi- 
tion. It was noticed that a function that satisfied these conditions turned out 
in certain instances not to be a genuine extremum. It was required to assemble 
a set of conditions that taken together are sufficient to ensure a maximum or 
a minimum. In researches of the late 1830s Carl Gustav Jacobi (1837, 1838)
introduced some new ideas in this direction that became the basis for a very 
active program of research. Jacobi formulated a certain condition, known in 
the later subject as Jacobi’s criterion, that must be satisfied by any solution 
to the problem. Jacobi’s theory was also based on a new transformation of 
the second variation. The variational integrand was expressed in a form that 
enabled one to infer Legendre’s condition for very general integrals. (The 
relevant history is presented in Todhunter 1861, Goldstine 1980 and Fraser 
2003.) 

The primary object of interest here is an integral involving a single
independent variable of the form $\int_a^b f(x, y, y', \ldots, y^{(n)})\,dx$, where the
integrand function $f$ is a function of x, y and the derivatives of y with respect to x
up to order n. It is necessary to find the particular function y = y(x) that 
maximizes or minimizes this integral. In the elementary case where n = 1, 
researches succeeded in providing a completely satisfactory theory. In 1857 

H.-D. Ebbinghaus, A. Kanamori (Eds.), Ernst Zermelo - Collected Works/
Gesammelte Werke II, DOI 10.1007/978-3-540-70856-8_2, Schriften der
Mathematisch-naturwissenschaftlichen Klasse der Heidelberger Akademie
der Wissenschaften 23, © Springer-Verlag Berlin Heidelberg 2013

Introductory Note, © Springer-Verlag, Paper, © Universitätsbibliothek der
Humboldt-Universität Berlin, English translation with kind permission by Springer-Verlag







Ludwig Otto Hesse showed in this case that if the Euler, Legendre and Jacobi 
conditions are satisfied then the resulting curve is indeed a maximum or a 
minimum. The more interesting and much more difficult case occurred when 
n ≥ 2. Here it was found that constants appearing in functions required in
the transformation of the second variation must satisfy certain conditions. 
It was necessary to show that it was possible to find a set of constants that 
worked in the general case. Essentially the problem was one of existence, of 
finding suitable mathematical objects that allowed the transformation to take 
place. (A succinct statement of the point in question is given in Lindelöf and
Moigno 1861.)

The central question was resolved by the Leipzig mathematician Adolph 
Mayer in his Habilitation thesis 1866, a work whose core content was pre- 
sented by Mayer two years later in an article in Crelle’s journal (1868). Mayer 
showed that if Jacobi’s criterion held, then it was possible to carry out the 
desired transformation of the second variation. Assuming the validity of Leg- 
endre’s criterion, one may infer that the given function satisfying the Euler 
equation is indeed a maximum or a minimum. Mayer presented his result in 
a very general setting, using a formulation of the variational problem that 
had been developed by Alfred Clebsch. Mayer’s investigation showed both 
technical sophistication and a deep understanding of the theoretical issues at 
the foundation of the theory. 

2. Weierstrass 

Weierstrass’s contributions to the calculus of variations were a product of 
his middle and late years. Although he began lecturing on the subject at 
the University of Berlin as early as 1865, his most significant results were 
presented in the summer lectures of 1879, when he was sixty-three years old. 
The edition which was eventually published in 1927 is based on these as well 
as a second set of lectures given in 1883. Although this delay in publication 
somewhat limited the dissemination of his ideas, he exerted considerable in- 
fluence on contemporary variational research. Copies of his notes circulated 
privately and his results began to be disseminated in published form by other 
researchers beginning in the middle 1890s. The appearance of Zermelo’s
dissertation in 1894 was among the first publications of Weierstrass’s ideas,
developed in a more general setting than the one adopted by Weierstrass. 

More than any other researcher Weierstrass established the critical out- 
look of the calculus of variations as a modern mathematical subject. In his 
lectures the distinction between necessary and sufficient conditions appears 
clearly for the first time. He carefully specified the continuity properties that 
must be satisfied by functions and their variations. In problems of constrained 
optimization he used theorems on implicit functions to ensure that the opti- 
mizing arc was embedded in a suitable family of comparison curves. 

Traditionally researchers in the calculus of variations did not identify at
the outset of their investigation the precise class of comparison arcs in a given
variational problem. There was no prior logical conception concerning the
nature of this class. However, the δ-process introduced by Lagrange required
that both the comparison arc and its slope at each point differ by only a small 
amount from the value and slope of the solution curve. This condition was 
imposed by the nature of the variational process, which involved expanding 
the integrand function as a Taylor series and investigating the behavior of 
the second variation arising in this expansion. Isaac Todhunter (1871, 269) 
in an essay on what were known as “discontinuous” solutions seems to have 
been the first to explicitly call attention to this limitation on the class of 
comparison arcs: 

[...] if we assert that the relation [Euler equation] does give a
minimum, we must bear in mind that this means a minimum with
respect to admissible variations [...] our investigation is not applicable
to such a variation as would be required in passing from the cycloid
to the discontinuous figure: in such a passage p [= δy'] would not
always be indefinitely small. Of course it might be possible to give
some special investigation for such a case, but certainly the case is
not included in the ordinary methods of the Calculus of Variations.

In his Berlin lectures Weierstrass developed an alternative to the tradi- 
tional expansion methods that extended the variational theory to a larger 
class of comparison curves. The precise nature of these curves was still de- 
termined by the particular technical requirements of the new method, but 
the logical orientation of the subject had shifted. In earlier variational re- 
search the nature of the mathematical objects was determined implicitly by 
the variational process that was employed. By contrast, in Weierstrass there 
was a self-conscious and explicit focus on the objects being studied. His work 
involved a more intimate connection between the foundations of real anal- 
ysis and the collection of concrete techniques and results that made up the 
variational theory. 

It is necessary to call attention to one aspect of the style in which Weier- 
strass developed the theory. Traditionally researchers in the calculus of varia- 
tions had adopted what is referred to as the ordinary or functional approach, 
in which the curve is expressed as y = y(x) and the variational integrand (in 
the simplest case) takes the form f(x,y,y'). A distinctive aspect of
Weierstrass’s approach was his adoption of a parametric approach. The curve C is
represented parametrically in the form y = y(t) and x = x(t). Here the
variational integral takes the form $I = \int_{t_0}^{t_1} f(x, y, x', y')\,dt$, where
$x' = \frac{dx}{dt}$ and $y' = \frac{dy}{dt}$. In the parametric (or homogeneous) formulation of the variational
problem it is necessary to impose conditions on the variational integrand in 
order to ensure that the problem is independent of the particular param- 
eterization chosen. One must attend to these conditions in developing the 
theory. 
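
The condition in question is usually stated as positive homogeneity of degree one in the derivatives, that is, $f(x, y, kx', ky') = k\,f(x, y, x', y')$ for $k > 0$. As a minimal numerical sketch (not taken from Weierstrass or Zermelo), the following Python fragment uses the arc-length integrand, chosen here purely as an illustrative assumption, and checks that two parameterizations of the same quarter circle give the same value of the integral. The function names are hypothetical.

```python
import math

def f(x, y, dx, dy):
    # Illustrative integrand (arc length); an assumption, not from the source.
    # It is positively homogeneous of degree 1 in (dx, dy).
    return math.hypot(dx, dy)

def parametric_integral(curve, t0, t1, n=2000, eps=1e-6):
    """Midpoint-rule value of I = integral of f(x, y, x', y') dt over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        x, y = curve(t)
        xa, ya = curve(t - eps)
        xb, yb = curve(t + eps)
        dx = (xb - xa) / (2 * eps)  # central-difference estimate of x'
        dy = (yb - ya) / (2 * eps)  # central-difference estimate of y'
        total += f(x, y, dx, dy) * h
    return total

def circle(t):
    # Quarter of the unit circle for t in [0, pi/2].
    return math.cos(t), math.sin(t)

def circle_reparam(s):
    # The same curve traversed with the parameter change t = s^2.
    return circle(s * s)

I1 = parametric_integral(circle, 0.0, math.pi / 2)
I2 = parametric_integral(circle_reparam, 0.0, math.sqrt(math.pi / 2))
homogeneous = math.isclose(f(0.0, 0.0, 3 * 0.6, 3 * 0.8), 3 * f(0.0, 0.0, 0.6, 0.8))
print(I1, I2, homogeneous)
```

Both values should agree with the length of the quarter circle up to discretization error. An integrand that is not homogeneous in this sense would in general give different values for the two parameterizations.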







Weierstrass used a parametric approach throughout his lectures on the 
calculus of variations. Researchers who adopted his new method tended to 
also use a parametric approach, and this was true in the case of Zermelo. 
However, not all researchers followed this practice. Although the parametric 
approach has certain advantages, particularly from a geometric viewpoint, 
its analytical development is less natural than the ordinary theory. During 
the years around 1900 when Weierstrass’s ideas were becoming more widely 
known, researchers such as Oskar Bolza (1909), William Osgood (1901) and 
Édouard Goursat (1905) went to some effort to reformulate his results in terms
of the ordinary theory. In the large majority of the textbook literature of the 
past one hundred years the ordinary approach is taken as the standard for- 
mulation of the variational problem while the parametric theory is presented 
as a special subject. 

For the sake of exposition we will adopt the ordinary theory in explaining 
some of the basic ideas that underlie Weierstrass’s theory. At the conclusion 
we indicate by way of comparison the parametric form in which Weierstrass 
originally presented his result. (In our account we have adopted the sensible 
notation used by Adolf Kneser to denote extremal and comparison curves: 
y = y(x) is the comparison curve, while y = y(x) is the extremal curve; both 
Weierstrass and Zermelo use the opposite convention.) 

Suppose $y = y_0(x)$ is an arc $C_0$ on which the variational integral
$I = \int_a^b f(x, y, y')\,dx$ is a minimum. (The case of a maximum is similar, with the
inequalities reversed.) In the traditional formulation of the theory involving
expansion methods and the second variation, the conditions that must be
satisfied are all specified in terms of the function $y = y_0(x)$ and this function
alone. Thus $y = y_0(x)$ will be a solution to the Euler equation and we must
have $\frac{\partial^2 f(x, y, y')}{\partial y'^2} > 0$ (Legendre’s condition) and the Jacobi criterion
must hold. One typically supposes that the family of comparison curves is
of the form $C : y = y_0(x) + \epsilon\zeta(x)$. Here $\zeta(x)$ is any function subject to
the usual continuity restrictions with $\zeta(a) = \zeta(b) = 0$. More generally we
may have a family of comparison curves of the form $C : y = y(x, \epsilon)$ where
$y(x, 0) = y_0(x)$ and $y(a, \epsilon) = y(b, \epsilon) = 0$. It is evident that the class of
comparison curves is very extensive. Nevertheless, as Todhunter observed
in 1870, it is also clear that there are some restrictions on this class. For
example, if $y = y_0(x) + \epsilon\zeta(x)$ ($\epsilon$ small) then $y'(x) - y_0'(x) = \epsilon\zeta'(x)$ with
similar relations for higher derivatives of y. It follows that the neighboring
curve $y = y(x)$ differs by only a small amount from the optimizing curve
$y = y_0(x)$ not just for corresponding values of y but for derivatives of y of all
orders.

It turns out that it is possible for a variational integral to be a minimum
for the function $y = y_0(x)$, considered with respect to a class of comparison
curves of the type $y = y(x, \epsilon)$, but not be a minimum if we allow comparison
curves whose slope differs by a finite amount from $y = y_0(x)$. The following
simple example illustrates this situation. It is taken from Bolza’s Lectures
on the calculus of variations (1904, 74), one of the earliest expositions of
the new theory developed by Weierstrass and extended by Zermelo. (The
term “die neue Variationsrechnung” was sometimes used in reference to the
Weierstrass theory.) We have $f(x, y, y') = y'^2 + y'^3$ defined on the interval
$[0, 1]$. The variational integral is $I = \int_0^1 (y'^2 + y'^3)\,dx$. It follows that a solution
to the Euler equation will be $y' = \text{constant}$. Since the solution must pass
through the endpoints it follows that the hypothetical minimizing curve is
simply $y = 0$. We have $\frac{\partial^2 f(x, y, y')}{\partial y'^2} = 2 > 0$, so the Legendre condition is
satisfied. It is also the case that Jacobi’s condition holds in this example.
Hence the curve $y = 0$ minimizes the integral with respect to comparison
arcs of the form $y = y(x, \epsilon)$. Consider now the comparison curve C consisting
of two straight lines, the first joining the origin (0, 0) to the point $(1 - p, q)$
and the second joining $(1 - p, q)$ to (1, 0) (see Fig. 1). Here p is a number
with $0 < p < 1$ and q is a small positive quantity. For this comparison curve
we have

$$\Delta I = \frac{q^2}{p(1-p)} \left( 1 + \frac{q}{1-p} - \frac{q}{p} \right).$$

We can make C lie within any neighborhood of $y = 0$ by making q sufficiently
small. With q specified, it is clear that $\Delta I < 0$ for $p \ll q$. Hence I is not a
minimum for the larger class of curves that includes the curve C.




Fig. 1 (p. 74 of Bolza 1904)
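
The value of the integral along the broken line in Bolza's example can be checked directly, since the integrand depends only on the slope and the slope is constant on each segment. The following Python sketch (the function names are hypothetical, and the sample values of p and q are arbitrary choices) integrates segment by segment and compares the result with the closed form $\frac{q^2}{p(1-p)}\left(1 + \frac{q}{1-p} - \frac{q}{p}\right)$. Because the integral vanishes along $y = 0$, the value along C is the difference between the two integrals.

```python
def f(dy):
    # Bolza's integrand f = y'^2 + y'^3; it depends on the slope only.
    return dy**2 + dy**3

def delta_I_direct(p, q):
    """Integrate f along the two straight segments of the broken line C."""
    slope_up = q / (1 - p)   # segment from (0, 0) to (1 - p, q)
    slope_down = -q / p      # segment from (1 - p, q) to (1, 0)
    return (1 - p) * f(slope_up) + p * f(slope_down)

def delta_I_closed(p, q):
    """Closed form: q^2 / (p (1 - p)) * (1 + q / (1 - p) - q / p)."""
    return q**2 / (p * (1 - p)) * (1 + q / (1 - p) - q / p)

q = 0.01
for p in (0.5, 0.1, 0.001):
    print(p, delta_I_direct(p, q), delta_I_closed(p, q))
```

With q fixed, the sign changes once p becomes small compared with q: the steep descending segment then contributes a large negative cubic term that outweighs the quadratic terms.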



In the terminology that was introduced by Kneser in 1900 and became 
standard, the traditional variational theory yields sufficient conditions for 
a weak extremum. Here each comparison curve is close to the minimizing 
curve at y and at all derivatives of y. By contrast, a solution will be a strong 
extremum if it is a minimum for the wider class of curves which are close to 
the solution curve but may have a slope that differs by a finite amount from 
the solution curve. 

Consider again the problem of finding the curve $C_0 : y = y_0(x)$ that
maximizes or minimizes $I = \int_a^b f(x,y,y')\,dx$. Suppose that the Euler and





Introductory note to 1894 






Jacobi conditions hold for the arc $C_0$. We now enlarge the class of possible
comparison curves to include ones whose slope differs by a finite amount from
that of $C_0$. In order to establish that $C_0$ is a minimum with respect to this
enlarged class of comparison curves it is necessary to formulate a condition
that involves not just the function $y = y_0(x)$ but also the curves $C : y = y(x)$
in the comparison class. The simplest approach would be to require that

\[ f(x, y_0, y_0') < f(x, y, y') \qquad (a < x < b) \]

for all comparison curves $y = y(x)$. Imposing this condition is evidently not
very informative, and indeed is only a restatement of the problem. Weierstrass
succeeded in formulating a more meaningful condition involving a function
$E$ called the excess function. We are given the proposed minimizing arc
$C_0 : y = y_0(x)$. We consider any comparison curve $C : y = y(x)$ and take any
point on this curve with coordinates $x$ and $y$. By assumption the point $(x, y)$
on the comparison curve $C$ is close to the point $(x, y_0)$ on $C_0$. Let $y'(x)$ be
the slope of the comparison curve at the given point. This quantity may vary
by an arbitrary amount from $y_0'(x)$.

A solution to the Euler equation is called an extremal. We typically require
that such a solution passes through the initial point. A key idea of Weierstrass
was to introduce a function $p(x, y)$, known as the slope function, defined as
the slope of the extremal passing through the point $(x,y)$ at this point. We now
introduce the excess function $E(x,y,y',p)$ defined as

\[ E(x,y,y',p) = f(x,y,y') - f(x,y,p) - (y' - p)\,\frac{\partial f}{\partial y'}(x,y,p). \tag{1} \]

Consider a region or strip about the curve $C_0$ and suppose that for each
point in this strip it is possible to determine an extremal joining the initial
point and the given point. Such a set of solutions to the Euler equation is
today called a field of extremals, a term introduced by Kneser. Consider the
condition

\[ E(x,y,y',p) > 0. \tag{2} \]

This condition is known as Weierstrass’s condition. If this condition is
satisfied for all comparison curves $y = y(x)$ on the interval then the value of
$I$ along $C_0$ is less than its value along any member $C$ of the comparison
class. We may conclude in this case that $C_0 : y = y_0(x)$ is a strong minimum.
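To see the condition at work, the sketch below evaluates the excess function for the integrand of Bolza's example, $f = y'^2 + y'^3$, with the slope function taken as $p = 0$ (its value along $y = 0$). There $E$ reduces to $y'^2(1 + y')$, which is negative for $y' < -1$, so the condition fails for steeply descending comparison slopes. The helper names are assumptions made for illustration.

```python
# Hedged sketch: excess function for f(y') = y'^2 + y'^3. Names are illustrative.

def f(slope):
    """Integrand of the example; it depends on the slope only."""
    return slope**2 + slope**3


def f_slope(slope):
    """Partial derivative of f with respect to y'."""
    return 2 * slope + 3 * slope**2


def excess(y_prime, p):
    """E = f(y') - f(p) - (y' - p) * df/dy'(p); x and y do not enter for this f."""
    return f(y_prime) - f(p) - (y_prime - p) * f_slope(p)


small_slope = excess(0.1, 0.0)      # comparison slope close to the field slope
steep_descent = excess(-10.0, 0.0)  # comparison slope far below the field slope
```

The sign change between the two sample slopes matches the failure of the strong minimum in the example, while the weak-extremum conditions remain satisfied.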

In later mathematics this result would be proved using something called 
the Hilbert invariant integral, introduced by David Hilbert in 1900 to simplify 
Weierstrass’s method. Here we present Weierstrass’s original idea, which was 
also adopted by Zermelo. The minimizing arc is given as $C_0 : (x, y_0(x))$ while
the neighboring curve $C$ is given as $C : (x, y(x))$. In Fig. 2 $C_0$ is the bottom
arc 01, while the neighboring curve $C$ is the arc 0321. We suppose that
$C$ lies in a narrow region about $C_0$. This region or “field” has the following
property: for any point within it, there is a unique solution curve to the Euler
equation (an extremal) passing through the initial point 0 and the given
point. We designate extremal curves using the functional notation $y = \bar y(x)$.
Let $(x,y)$ be any point on $C$; in Fig. 2 this point is labelled 2. Consider the
extremal curve $(x, \bar y(x))$ passing from 0 through 2; in Fig. 2 it is the curve
02. We introduce the function $\phi(x)$ defined as

\[ \phi(x) = \int_0^2 f(x,\bar y,\bar y')\,dx + \int_2^1 f(x,y,y')\,dx. \tag{3} \]




This integral is taken along the extremal arc from 0 to 2 and then along
the comparison arc from 2 to 1. Evidently we have $\phi(0) = \int_0^1 f(x,y,y')\,dx$
and $\phi(1) = \int_0^1 f(x,y_0,y_0')\,dx$. The statement that $\int_0^1 f(x,y_0,y_0')\,dx$ is a
minimum is equivalent to the inequality $\phi(0) > \phi(1)$, which would follow if
we are able to show that $\phi(x)$ is a decreasing function of $x$. To do this we
calculate the derivative of $\phi(x)$ and show that it is negative.

Let the integrals $I_{02}$ and $I_{21}$ be defined as

\[ I_{02} = \int_0^2 f(x,\bar y,\bar y')\,dx, \qquad I_{21} = \int_2^1 f(x,y,y')\,dx. \tag{4} \]

From the standard formula for the variation of the integral when the endpoint
is allowed to vary in both the $x$ and $y$ directions we have

\[ \delta I_{02} = \frac{\partial f}{\partial \bar y'}(x,\bar y,\bar y')\,\delta y + \Bigl(f(x,\bar y,\bar y') - \bar y'\,\frac{\partial f}{\partial \bar y'}(x,\bar y,\bar y')\Bigr)\delta x. \tag{5} \]

We now let $\delta x = dx$ and $\delta y = dy$. Note that at the point 2 we have $\bar y = y$
and $\bar y' = p(x,y)$, where $p$ is the slope function for the given field. Hence (5)
becomes

\[ \frac{dI_{02}}{dx} = \frac{\partial f}{\partial y'}(x,y,p)\,y' + f(x,y,p) - p\,\frac{\partial f}{\partial y'}(x,y,p) = \frac{\partial f}{\partial y'}(x,y,p)\,(y'-p) + f(x,y,p). \tag{6} \]

The derivative of $I_{21}$ is given immediately as

\[ \frac{dI_{21}}{dx} = -f(x,y,y'). \tag{7} \]

Thus we have

\[ \phi'(x) = \frac{dI_{02}}{dx} + \frac{dI_{21}}{dx} = -\Bigl(f(x,y,y') - f(x,y,p) - (y'-p)\,\frac{\partial f}{\partial y'}(x,y,p)\Bigr), \]

or

\[ \phi'(x) = -E(x,y,y',p). \tag{8} \]

If $E(x,y,y',p) > 0$ on the interval then $\phi'(x)$ is negative and $\phi(x)$ is a
decreasing function, which is what was required to be proved.

It is apparent that $\phi(0) - \phi(1) = \int_0^1 E(x,y,y',p)\,dx$. But $\phi(0) - \phi(1)$ is
equal to $\Delta I$, the variation of the integral with respect to the comparison arc.
Hence we have

\[ \Delta I = \int_0^1 E(x,y,y',p)\,dx. \tag{9} \]

(9) is known in the modern literature as Weierstrass’s theorem, although it
does not appear explicitly in Weierstrass’s lectures. From (9) it is clear why
the function $E$ is called the excess function, since the excess of the variational
integral $I$ in going from $y = y_0(x)$ to $y = y(x)$ is the integral of $E$ over the
given interval.
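Relation (9) can be tested on Bolza's example. The sketch below is an illustration under stated assumptions: because $f = y'^2 + y'^3$ depends on the slope alone, the extremals are straight lines, and the central field through the origin has slope function $p(x,y) = y/x$. The broken line $C$ of the example coincides with a field extremal on its first segment, so $E$ vanishes there and only the second segment contributes. The function names, the midpoint rule and the sample values are choices made here, not part of the source.

```python
# Hedged sketch: compare the integral of E along C with the closed form of
# Delta I for f(y') = y'^2 + y'^3. All names are illustrative.

def f(slope):
    return slope**2 + slope**3


def f_slope(slope):
    """Partial derivative of f with respect to y'."""
    return 2 * slope + 3 * slope**2


def excess(y_prime, p):
    """E = f(y') - f(p) - (y' - p) * df/dy'(p)."""
    return f(y_prime) - f(p) - (y_prime - p) * f_slope(p)


def delta_i_closed(p_kink, q):
    """Closed form of Delta I for the broken line with corner (1 - p_kink, q)."""
    return q**2 / (p_kink * (1 - p_kink)) * (1 + q / (1 - p_kink) - q / p_kink)


def delta_i_from_excess(p_kink, q, steps=20000):
    """Midpoint-rule integral of E over the second segment of C."""
    slope = -q / p_kink                  # slope y' of the second segment
    width = p_kink / steps
    total = 0.0
    for k in range(steps):
        x = (1 - p_kink) + (k + 0.5) * width
        y = q * (1 - x) / p_kink         # height of C at x
        total += excess(slope, y / x) * width   # field slope p(x, y) = y / x
    return total


gentle = delta_i_from_excess(0.3, 0.1)
steep = delta_i_from_excess(0.001, 0.01)
```

For both sample corners the integral of the excess function agrees with the directly computed variation, including the case in which the variation is negative.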

In Weierstrass’s original parametric approach the variational integral
takes the form $\int_{t_0}^{t_1} F(x,y,x',y')\,dt$. The minimizing arc is given as
$C_0 : (x_0(t), y_0(t))$ while the neighboring curve $C$ is given as $C : (x(t), y(t))$. In
Fig. 2 $C_0$ is the bottom arc 01, while the neighboring curve $C$ is the arc
0321. We suppose that $C$ lies in a narrow region about $C_0$. This region is
supposed to be a “field” in the sense defined above: for any point within it
there is a unique solution curve to the Euler equation passing through the
initial point and the given point. Let $(x(t), y(t))$ be any point on $C$; in Fig. 2
this point is labelled 2. Consider the extremal passing through this point; in
Fig. 2 it is the curve 02. Let $p(t), q(t)$ be the coordinate slope functions of
the extremal at $(x,y)$. The excess function $E$ in parametric form is defined
as

\[ E(x,y,p,q,x',y') = F(x,y,x',y') - x'\,\frac{\partial F}{\partial x'}(x,y,p,q) - y'\,\frac{\partial F}{\partial y'}(x,y,p,q). \tag{10} \]

If for each such comparison curve $C$ we have

\[ E(x,y,p,q,x',y') > 0 \tag{11} \]

for all values of $x, y, x', y'$ then we may conclude that $C_0$ minimizes the
integral $\int_{t_0}^{t_1} F(x,y,x',y')\,dt$.



In the ordinary calculus a condition that $y = y(x)$ be a minimum at $x = a$
is that $\frac{dy}{dx}(x = a) = 0$. No reference is made in this condition to neighboring
values of $a$. Similarly, in the case of weak extrema in the calculus of variations
the conditions are formulated solely in terms of the curve $C_0 : y = y_0(x)$ and
do not involve any reference to neighboring curves or functions. By contrast,
Weierstrass’s condition involves the comparison curve as well as the field
function $p(x, y)$ defined in a neighborhood of $C_0$. It should be noted that
while it is true that Weierstrass has obtained a stronger result, this is possible
because the condition that must be satisfied is more restrictive; the stronger
result is achieved at a higher price.



3. Zermelo’s dissertation 

It was inevitable that Zermelo’s readership would be restricted because he 
was extending a mathematical theory that itself had not been published and 
that would have been familiar only to a fairly small group of researchers either 
at German universities or who had studied there. Furthermore, Weierstrass’s 
parametric approach was not widely used in the calculus of variations; even 
investigators such as Ludwig Scheeffer (1885), Georg Erdmann (1877) and 
Edmund Husserl (1882) who were in a general sense part of the Weierstrass 
“school” and were influenced by his ideas employed the ordinary formulation 
of the variational problem in their researches of the 1870s and 1880s. 

Zermelo’s dissertation was also written in a rather formal manner, with 
very limited exposition of basic ideas and principles, and was excessively con- 
cerned with procedural matters, detailed formulations required in the general 
theory, and points of rigor. In matters of style, in its propensity for strenuous 
formal development, his approach bore similarities with the work of such ear- 
lier researchers as Hesse. Zermelo’s work displayed as well a new element in 
variational research, a tendency to want to develop the subject from a larger 
viewpoint and to present the results as an instance of some more general and 
yet to be precisely specified subject. This tendency was manifested in his 
study of homogeneity properties of functions in the first chapter, as well as 
in his classification of families of curves in the second chapter. 







Zermelo’s dissertation would have been of primary interest to a reader 
who was either already familiar with Weierstrass’s lectures or who was very 
motivated to learn about his new methods. For such a reader, the work would 
have been a valuable piece of mathematical research and exposition. In a 
fully detailed and methodical manner Zermelo developed a general theory, 
showing fully the non-trivial considerations that are involved in extending 
Weierstrass’s methods to the problem n > 1. The first chapter was devoted 
to the homogeneity relations that must be satisfied in the parametric theory; 
the second to necessary conditions; the third to the excess function; and the 
final fourth chapter to sufficiency conditions involving the excess function. 




To indicate the basic idea behind Zermelo’s development we describe how
it plays out for the ordinary problem in the case where the variational
integrand is a function of $x$, $y$, $y'$ and $y''$. This was the setting in which Jacobi
and so many other researchers set forth the theory. As before let $y = y_0(x)$ be
the solution curve to the Euler equation joining 1 and 2. In Fig. 3, from p. 90
of Zermelo’s dissertation, this curve is denoted as a (note that the points
are numbered slightly differently than in Weierstrass). We suppose there is a
strip or region (what later was called a field) about a with the property that
there is a unique extremal joining 0 (a point very close and to the left of 1 on
a) and any given point 3 of the region. (In a technical refinement of
Weierstrass’s method, Zermelo takes the common starting point of the extremals
to be 0 rather than 1 in order to simplify the analysis needed to establish
the existence of the desired field. The value of the variational integral from 0
to 1 is taken to be negligible.) An arbitrary comparison curve 132: $y = y(x)$
is designated as a. It is assumed that for each point 3 on a there is a unique
extremal curve (a solution to the Euler equation) $y = \bar y(x)$, joining 0 to 3.
This curve is designated as u in Fig. 3. Consider the function $\phi(x)$ given as

\[ \phi(x) = \int_0^3 f(x,\bar y,\bar y',\bar y'')\,dx + \int_3^2 f(x,y,y',y'')\,dx. \tag{12} \]

Let the integrals in (12) be designated as $I_{03}$ and $I_{32}$:

\[ I_{03} = \int_0^3 f(x,\bar y,\bar y',\bar y'')\,dx, \qquad I_{32} = \int_3^2 f(x,y,y',y'')\,dx. \tag{13} \]

$I_{03}$ is evaluated along the extremal curve u from 0 to 3, while $I_{32}$ is evaluated
along the comparison curve a from 3 to 2. The key idea is to write $\delta I_{03}$ using
the variable endpoint formula applied to the case where there are second
derivatives in the variational integrand. We have



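\[ \delta I_{03} = \Bigl(\frac{\partial f}{\partial \bar y'} - \frac{d}{dx}\frac{\partial f}{\partial \bar y''}\Bigr)\delta y + \frac{\partial f}{\partial \bar y''}\,\delta y' + \Bigl(f - \bar y'\Bigl(\frac{\partial f}{\partial \bar y'} - \frac{d}{dx}\frac{\partial f}{\partial \bar y''}\Bigr) - \bar y''\,\frac{\partial f}{\partial \bar y''}\Bigr)\delta x, \tag{14} \]

where $f$ and its partial derivatives are evaluated at $(x,\bar y,\bar y',\bar y'')$.

We now let $\delta x = dx$, $\delta y = dy$ and $\delta y' = dy'$. Note that at the point 3 we
have $\bar y = y$ and $\bar y' = p(x,y)$, $\bar y'' = q(x,y)$, where $p$ and $q$ are the field
functions for the first and second derivatives of the extremal passing through 3.
With these designations (14) becomes

\[ \frac{dI_{03}}{dx} = \Bigl(\frac{\partial f}{\partial p} - \frac{d}{dx}\frac{\partial f}{\partial q}\Bigr)y' + \frac{\partial f}{\partial q}\,y'' + f(x,y,p,q) - p\Bigl(\frac{\partial f}{\partial p} - \frac{d}{dx}\frac{\partial f}{\partial q}\Bigr) - q\,\frac{\partial f}{\partial q}, \tag{15} \]

with the partial derivatives evaluated at $(x,y,p,q)$. We also have

\[ \frac{dI_{32}}{dx} = -f(x,y,y',y''). \tag{16} \]

Hence the derivative of $\phi(x)$ is

\[ \phi'(x) = -E(x,y,y',y'',p,q), \tag{17} \]

where

\[ E(x,y,y',y'',p,q) = f(x,y,y',y'') - f(x,y,p,q) - (y'-p)\Bigl(\frac{\partial f}{\partial p} - \frac{d}{dx}\frac{\partial f}{\partial q}\Bigr) - (y''-q)\,\frac{\partial f}{\partial q}. \tag{18} \]

If $E(x,y,y',y'',p,q) > 0$ then it follows that $\phi'(x) < 0$ and

\[ \int_0^2 f(x,y_0,y_0',y_0'')\,dx < \int_0^2 f(x,y,y',y'')\,dx. \]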

It is also apparent from (17) that $\phi(0) - \phi(1) = \Delta I$. Hence we have the
following expression for the variation of the integral with respect to the
comparison arc 032:

\[ \Delta I = \int_0^2 E(x,y,y',y'',p,q)\,dx. \tag{19} \]

(19) is Weierstrass’s theorem and is the culminating result of Zermelo’s
treatise. It is stated on p. 79 of his dissertation in parametric form for the general
case involving derivatives of order up to $n$. (It should be noted that the
appellation “Weierstrass’s theorem” was not used by Zermelo.) In the case $n = 1$
we have:

\[ \Delta I = \int_{\sigma_1}^{\sigma_2} E(x,y,x',y',p,q)\,d\sigma, \tag{20} \]

where $\sigma$ is the parameter and where the excess function in parametric form
is given as

\[ E = f(x,y,x',y') - f(x,y,p,q) - \frac{\partial f(x,y,p,q)}{\partial x'}\,(x'-p) - \frac{\partial f(x,y,p,q)}{\partial y'}\,(y'-q). \tag{21} \]

In the traditional theory of sufficiency based on expansion methods it is 
necessary to ensure that there is no admissible $\delta y$ which makes the second
variation vanish. A point at which the second variation vanishes came to be 
called a conjugate point (a term coined by Weierstrass) and the problem has 
a solution only if there are no conjugate points on the interval. It is also 
necessary to show that there exist certain functions that allow one to trans- 
form the second variation to a suitable quadratic form. Mayer’s achievement 
in his publications of the 1860s was to show in a very general setting that 
if there is no conjugate point on the interval then it is possible to produce 
the requisite functions needed in the transformation of the second variation. 
The basic problem here is one of mathematical existence. Zermelo following 
Weierstrass was confronted with a different kind of existence question. In or- 
der to carry out the derivation of equation (20) it is necessary to embed the 
extremal joining the endpoints in a field of extremals. Zermelo supplemented 
his presentation of (20) with an extended discussion of the existence of such 
a field and the conditions that are required for it. His approach was to write 
down an analytical condition stating that there is no conjugate point on the 
interval. From this condition it is shown that there is a strip or field about 
the given extremal joining the points A and B with the following property: 
for each point P in this region there is a unique extremal passing through 
it. If the variational integrand contains derivatives up to order $n$, then the
extremal at $P$ will have an $n$th order derivative at $P$ that is a function of the
values of $x, y', y'', \ldots, y^{(n-1)}$ there. Field-theoretic questions were an
important part of Zermelo’s theory and would become the focus of much further
work in the calculus of variations. 



4. Further discussion of Zermelo’s theory 

The variable endpoint formula (14) plays an essential role in the derivation
of the condition $E(x,y,y',y'',p,q) > 0$. In (14) the increments $\delta x$, $\delta y$ and
$\delta y'$ are small increments in $x$, $y$ and $y'$. It is immediately clear that the slope
of any comparison curve may only differ from the slope of the actual solution
curve by a small amount. Thus in the Weierstrassian theory the case $n = 2$ is
essentially different from the case $n = 1$, where the slope of the comparison
curve may differ by any finite amount from the slope of the solution curve.
In fact, the restriction on the slope of the comparison curve in the case $n = 2$
is the same as in the case of weak extrema for $n = 1$! In the general case in
which the integrand function $f$ contains derivatives of $y$ up to order $n$, the
comparison curve must differ by only a small amount at its derivatives up to
order $n - 1$. Of course, the derivatives of order $n$ and higher may take on any
value, so it is clear that the class of comparison curves is still larger than in
the case of weak extrema.

There are several aspects of Zermelo’s theory that somewhat limited its 
influence on the later development of the calculus of variations. His formu- 
lation using a parametric approach seems to have stemmed from a desire to 
remain faithful to Weierstrass’s original exposition. However, the parametric 
formulation really constitutes a special topic, valuable from a certain geo- 
metric viewpoint but much too awkward to form the primary basis of the 
subject. Another important event was Hilbert’s introduction (1900a, 1905) 
of the invariant integral, giving rise to an essential tool that transformed 
the theory. As we show below, the use of the invariant integral simplified the 
derivation of Weierstrass’s theorem and provided a tool that could be applied 
to more general problems. 



Mention should also be made of the central problem of concern to Zer- 
melo. Although the variational problem with higher-order derivatives had 
been very prominent in the writings of Jacobi and his successors, it virtually 
disappeared from the textbook literature in the twentieth century. Instead 
one developed the theory for $n$ dependent variables with variational
integrands that contain only the first derivatives of the variables. The general
variational integral is here $\int_a^b f(x,y_1,y_2,\ldots,y_n,y_1',y_2',\ldots,y_n')\,dx$. The
investigation of sufficiency is carried out in this setting. The case of higher order
derivatives is then treated as an optimization problem subject to constraint.
The basic idea goes back to Clebsch 1858a, b and is illustrated by the problem
of minimizing $\int_a^b f(x,y,y',y'')\,dx$. This problem can be reformulated as the
problem of minimizing the integral $\int_a^b f(x,y_1,y_2,y_2')\,dx$ subject to the side
constraint $y_1' - y_2 = 0$. Using the multiplier rule this problem is equivalent
to minimizing the integral $\int_a^b \bigl(f(x,y_1,y_2,y_2') + \lambda(x)(y_1' - y_2)\bigr)\,dx$. The Euler
equations for this problem are

\[ \frac{\partial f}{\partial y_1} - \frac{d(\lambda(x))}{dx} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y_2} - \lambda(x) - \frac{d}{dx}\frac{\partial f}{\partial y_2'} = 0. \]

Noting that $y_1' - y_2 = 0$ we find that these two equations reduce to

\[ \frac{\partial f}{\partial y_1} - \frac{d}{dx}\frac{\partial f}{\partial y_2} + \frac{d^2}{dx^2}\frac{\partial f}{\partial y_2'} = 0, \]

the Euler equation for $\int_a^b f(x,y,y',y'')\,dx$ with $y = y_1$. Sufficient conditions
for the problem $\int_a^b f(x,y,y',y'')\,dx$ are in turn deduced from the general theory
of sufficiency developed for the integral $\int_a^b f(x,y_1,y_2,\ldots,y_n,y_1',y_2',\ldots,y_n')\,dx$
and applied to the particular integral
$\int_a^b \bigl(f(x,y_1,y_2,y_2') + \lambda(x)(y_1' - y_2)\bigr)\,dx$.
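The reduction of the two Euler equations to the single fourth-order equation can be spelled out step by step. The following is a sketch of the elimination of the multiplier, using only the two Euler equations and the side constraint stated above; the intermediate expression for $\lambda$ is not written out in the text and is supplied here for illustration.

```latex
\begin{align*}
% Step 1: solve the second Euler equation for the multiplier.
\lambda(x) &= \frac{\partial f}{\partial y_2}
              - \frac{d}{dx}\frac{\partial f}{\partial y_2'} \\
% Step 2: substitute into the first Euler equation.
0 = \frac{\partial f}{\partial y_1} - \frac{d\lambda}{dx}
  &= \frac{\partial f}{\partial y_1}
     - \frac{d}{dx}\frac{\partial f}{\partial y_2}
     + \frac{d^2}{dx^2}\frac{\partial f}{\partial y_2'} \\
% Step 3: rename y_1 = y, y_2 = y_1' = y', y_2' = y''.
0 &= \frac{\partial f}{\partial y}
     - \frac{d}{dx}\frac{\partial f}{\partial y'}
     + \frac{d^2}{dx^2}\frac{\partial f}{\partial y''}
\end{align*}
```

The side constraint enters only in the last step, where it identifies $y_2$ with $y'$ and $y_2'$ with $y''$.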



Despite these limitations, Zermelo’s dissertation was important in bring- 
ing Weierstrass’s ideas forward in published form and in developing the the- 
ory in new directions. It provided a source for the work of Kneser, Hilbert, 
Mayer (1904), Osgood (1900/1901, 1901a) and Bolza as well as the other
researchers of the period. The work is cited no less than eight times in Bolza’s
Lectures on the calculus of variations (1904), on pp. 9, 35, 72, 76, 82, 119,
143, 174. Special note should be made of Bolza’s discussion (p. 174) in Chap- 
ter V of transversals to sets of extremals, where attention is called to a result 
proved by Zermelo (p. 96 of his dissertation) concerning the envelope of a set 
of extremals. 

In the history of the calculus of variations there are examples of re- 
searchers who began in this branch of mathematics and continued to make im- 
portant contributions to it throughout their career. One might mention here 
such figures as Lagrange, Mayer and the American mathematician Gilbert 
Bliss. However, Zermelo belongs to another historical pattern of investigators 
who cut their teeth in the calculus of variations and then went on to promi- 
nence in very different fields of research. One could mention in addition to 
Zermelo (set theory) such figures as Charles Delaunay (celestial mechanics), 
Clebsch (algebraic geometry), Husserl (philosophy), and Herman Goldstine 
(computer science and numerical analysis). 



5. Epilogue: Hilbert’s invariant integral 

Weierstrass’s theorem in parametric form is given by (20). This statement is 
evidently relative to the particular parameterization chosen. Let us assume 
that we write the theorem in a form that is independent of any particular 
parameterization. One obvious way to do this would be to develop the theory 
in traditional ordinary form, using x as the independent variable and y as 
the dependent variable. In ordinary form Weierstrass’s theorem is written:

\[ \Delta I = \int_{x_1}^{x_2} E(x,y,y',p)\,dx \tag{22} \]

where

\[ E = f(x,y,y') - f(x,y,p) - (y'-p)\,\frac{\partial f}{\partial y'}(x,y,p). \tag{23} \]

We have

\[ \Delta I = \int_{x_1}^{x_2} \bigl(f(x,y,y') - f(x,y_0,y_0')\bigr)\,dx, \tag{24} \]

where $y_0 = y_0(x)$ is the extremal joining the initial and final points. From
(22), (23) and (24) it follows that

\[ \int_{x_1}^{x_2} \bigl(f(x,y,y') - f(x,y_0,y_0')\bigr)\,dx = \int_{x_1}^{x_2} \Bigl(f(x,y,y') - f(x,y,p) - (y'-p)\,\frac{\partial f}{\partial y'}(x,y,p)\Bigr)\,dx, \tag{25} \]

or

\[ \int_{x_1}^{x_2} f(x,y_0,y_0')\,dx = \int_{x_1}^{x_2} \Bigl(f(x,y,p) + (y'-p)\,\frac{\partial f}{\partial y'}(x,y,p)\Bigr)\,dx. \tag{26} \]

Because $y = y_0(x)$ is given, the quantity on the left side of (26) is constant.
Hence from (26) we deduce that the integral

\[ H = \int_{x_1}^{x_2} \Bigl(f(x,y,p) + (y'-p)\,\frac{\partial f}{\partial y'}(x,y,p)\Bigr)\,dx \]

has the same value for all comparison curves $y = y(x)$: the integral $H$ is
invariant with respect to the path.
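The invariance of $H$ can be observed numerically. The sketch below is an illustration under assumptions chosen here, not a construction from the source: it uses the integrand $f = y'^2 + y'^3$ of Bolza's example, whose extremals are straight lines, and the field of lines through the point $(-1, 0)$, which lies to the left of the initial point in the spirit of Zermelo's refinement. The slope function is then $p(x,y) = y/(x+1)$. Along $y = 0$ the integrand of $H$ vanishes, so $H$ should be close to zero along any other path from $(0,0)$ to $(1,0)$.

```python
# Hedged sketch: path independence of the Hilbert integral H for
# f(y') = y'^2 + y'^3 in the field of lines through (-1, 0). Names are illustrative.

def f(slope):
    return slope**2 + slope**3


def f_slope(slope):
    """Partial derivative of f with respect to y'."""
    return 2 * slope + 3 * slope**2


def slope_field(x, y):
    """Slope of the straight-line extremal through (-1, 0) and (x, y)."""
    return y / (x + 1.0)


def hilbert_on_segment(x_start, y_start, x_end, y_end, steps=20000):
    """Midpoint-rule value of H along a straight segment."""
    y_prime = (y_end - y_start) / (x_end - x_start)
    width = (x_end - x_start) / steps
    total = 0.0
    for k in range(steps):
        x = x_start + (k + 0.5) * width
        y = y_start + y_prime * (x - x_start)
        p = slope_field(x, y)
        total += (f(p) + (y_prime - p) * f_slope(p)) * width
    return total


def hilbert_on_broken_line(p_kink, q):
    """H along the broken line through (0, 0), (1 - p_kink, q) and (1, 0)."""
    return (hilbert_on_segment(0.0, 0.0, 1.0 - p_kink, q)
            + hilbert_on_segment(1.0 - p_kink, q, 1.0, 0.0))


h_extremal = hilbert_on_segment(0.0, 0.0, 1.0, 0.0)
h_gentle = hilbert_on_broken_line(0.5, 0.2)
h_steep = hilbert_on_broken_line(0.05, 0.2)
```

The three values agree up to discretization error, even though the variational integral itself takes very different values along the three paths.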

Hilbert did not discuss how he arrived at the idea of the invariant in- 
tegral: in his account it is something that is introduced without any expla- 
nation. However, it is reasonable to suppose that he first came across the 
idea by simply writing down Weierstrass’s theorem in ordinary form, and 
noticing as we did above that the integral $H$ is invariant. It was then a
simple matter to show directly that $H$ is invariant. Using the Euler equation

\[ \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f(x,y,p)}{\partial p} = 0 \]

it is straightforward to prove that

\[ \frac{\partial}{\partial y}\Bigl(f(x,y,p) - p\,\frac{\partial f}{\partial p}(x,y,p)\Bigr) = \frac{\partial}{\partial x}\Bigl(\frac{\partial f}{\partial p}(x,y,p)\Bigr), \]

and so the condition for the integrability of the differential form
$\bigl(f(x,y,p) - p\,\frac{\partial f}{\partial p}(x,y,p)\bigr)\,dx + \frac{\partial f}{\partial p}(x,y,p)\,dy$ is satisfied. Having
established that $H$ is invariant directly we can then use this fact to provide a new proof of
Weierstrass’s theorem, which is what Hilbert did. A significant advantage 
of Hilbert’s approach is that a wider class of fields can be used in the suf- 
ficiency proof. In the theory of Weierstrass and Zermelo, the extremals of 
the field pass through a single point: in the case of Weierstrass this point 
is the initial point of the extremal, and in the case of Zermelo it is a point 
very close to the initial point (see Bolza 1904, 82, note 1). Such a field is 
said to be a central field. By contrast, the proof of Weierstrass’s theorem 
using the invariant integral applies to any covering of a region surrounding 
the solution curve by a family of extremals in which only one extremal passes 
through each point of the region. The invariant integral can also be applied 
to more general variational problems, and is an important field-theoretic tool 
in the investigation of extrema. (For later literature related to this subject, 
see Hadamard 1910, Bliss 1925 and Bliss 1946. The relevant history may be
found in Thiele 2007.) 

In the publication of his Paris address of 1900, where Hilbert first pre- 
sented the idea of the invariant integral, he referred to Kneser’s Lehrbuch but 
not to Zermelo. However, we know that he held Zermelo’s work in the calculus 
of variations in high regard. In 1903 he recommended Zermelo for a position 
at the University of Breslau, writing 1 “Zermelo is a modern mathematician 
who combines versatility with depth in a rare way. He is an expert in the cal- 
culus of variations (and working on a comprehensive monograph about it). 
I regard the calculus of variations as a branch of mathematics which will be- 
long to the most important ones in the future.” Hilbert added that some years 
earlier, “Zermelo was my main mathematical company, and I have learnt a 
lot from him, for example, the Weierstrassian calculus of variations.” Zermelo 
was not offered the position. 



1 In a letter to the hiring committee; cf. Ebbinghaus 2007, 35-36, 276-277. 




Untersuchungen zur Variations-Rechnung 

1894 



Die Grundlage der nachstehenden Untersuchungen bilden die Vorlesungen
des Herrn Prof. Weierstrass über „Variations-Rechnung“, die, während einer
Reihe von Jahren an der Berliner Universität gehalten, mir zuerst im Sommer
1892 durch einige Ausarbeitungen im Besitz des „Mathematischen Vereins“,
zunächst aber durch eine Vorlesung des Herrn Prof. H. A. Schwarz ihrem
wesentlichen Inhalte nach zur Kenntnis gekommen sind. Meine Arbeit wird von
dem Bestreben geleitet, einen Teil der von Herrn Weierstrass neu entwickelten
strengen Methoden, die sich auf Maxima und Minima von Integralen der
einfachsten Form beziehen, zunächst ohne Berücksichtigung von
Nebenbedingungen und in wesentlich theoretischem Interesse, auf den allgemeineren
Fall auszudehnen, wo die Function unter dem Integralzeichen Ableitungen
beliebig hoher Ordnung enthält. Diese Verallgemeinerung bezieht sich auf die
Darstellung der Curven durch einen Parameter, auf die Durchführung einer
strengen Definition des Maximums oder Minimums und auf die Anwendung
der von Herrn Weierstrass eingeführten Function E zur Auffindung
notwendiger und hinreichender Bedingungen, während die auch in jenen Vorlesungen
untersuchte „zweite Variation“ hier grundsätzlich ausser Betracht geblieben
ist. Dagegen ist die wertvolle Arbeit Ludwig Scheeffers „Die Maxima und
Minima der einfachen Integrale zwischen festen Grenzen“ (Math. Ann. XXV),
die bei sehr ähnlicher Betrachtungsweise doch erhebliche Verschiedenheiten
der Methode aufweist, hier ohne wesentlichen Einfluss gewesen.



Erster Abschnitt. 

Über die Bedingungen, denen die Function unter dem
Integralzeichen genügen muss.

Die Untersuchungen von Herrn Weierstrass beziehen sich auf Integrale 
der Form: 



\[ J = \int_{t_1}^{t_2} F(x,y;x',y')\,dt \qquad \Bigl(x' = \frac{dx}{dt},\; y' = \frac{dy}{dt}\Bigr), \]

wo $x = \varphi(t)$, $y = \psi(t)$ als die laufenden Coordinaten eines Curvenstückes
aufgefasst werden können, über welches die Integration zu erstrecken ist.




Investigations in the calculus of variations 

1894 



The following investigations are based on courses of lectures on the “calcu- 
lus of variations” given by Prof. Weierstrass at the University of Berlin for a 
number of years. I originally became aware of their basic ideas in the summer 
of 1892 through several sets of notes in the possession of the “Mathematischer 
Verein”, but first through a lecture by Prof. H. A. Schwarz. In my work, which 
is essentially guided by theoretical interest, I seek to extend some of the new, 
rigorous methods developed by Mr. Weierstrass concerning the maxima and 
minima of integrals of the simplest form to the more general case where 
the function under the integral sign contains derivatives of arbitrary order 
without, at first, taking into consideration ancillary conditions. This gener- 
alization concerns the representation of the curves by means of a parameter, 
the establishment of a rigorous definition of the maximum or the minimum, 
and the application of the function E introduced by Mr. Weierstrass in order 
to find necessary and sufficient conditions, whereas the “second variation” also 
investigated in those lectures will not, in general, be considered here. On the 
other hand, the valuable work “Die Maxima und Minima der einfachen
Integrale zwischen festen Grenzen” by Ludwig Scheeffer (Scheeffer 1885), which,
for all the similarities in perspective, uses a very different method, has had 
little influence on the present work. 



First section. 

On the conditions to be met by the function under the 

integral sign. 

The investigations by Mr. Weierstrass are concerned with integrals of the 
form 



\[ J = \int_{t_1}^{t_2} F(x,y;x',y')\,dt \qquad \Bigl(x' = \frac{dx}{dt},\; y' = \frac{dy}{dt}\Bigr), \]

where $x = \varphi(t)$, $y = \psi(t)$ may be considered the current coordinates of a
curve segment along which the integration is to be taken.







Zermelo 1894 



Soll dieses Integral von der in gewisser Beziehung willkürlichen Art, wie
die Coordinaten der Curve als Functionen des Parameters $t$ dargestellt sind,
unabhängig sein, so muss, wie gezeigt wird, $F$ in Bezug auf seine beiden
letzten Argumente $x', y'$ homogen von erster Dimension sein, oder, was dasselbe
sagt, der Bedingung genügen:

\[ \frac{\partial F}{\partial x'}\,x' + \frac{\partial F}{\partial y'}\,y' = F, \]

woraus weitere Formeln entspringen.

Zur analogen Untersuchung des allgemeineren Falles:

\[ J = \int_{t_1}^{t_2} F\bigl(x,x',\ldots x^{(n)};\, y,y',\ldots y^{(n)}\bigr)\,dt \]

haben wir zunächst nach den entsprechenden Bedingungsgleichungen für
diese allgemeinere Function $F$ zu fragen.

Zur Herleitung derselben werde ich mich, da hier die direkte Substitution
eines anderen Parameters nicht mehr zweckmässig scheint, in ähnlicher Weise
der Variationsrechnung selbst bedienen, wie man es seit Euler zur Aufstellung
der gewöhnlichen „Integrabilitäts-Bedingungen“ zu thun pflegt. Doch wird
diese Anwendung eine rein formale sein und keine anderen Principien als die
Elemente der Differential-Rechnung zu Grunde legen.

\[ F = F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) \]

(in abgekürzter Schreibweise) werde als eine analytische Function ihrer
sämtlichen Argumente vorausgesetzt, die in dem ganzen betrachteten Bereiche
den Charakter einer ganzen Function, also auch partielle Ableitungen
beliebig hoher Ordnung besitzt.

\[ x = \varphi(\vartheta), \qquad y = \psi(\vartheta) \]

seien eindeutige und mit ihren $r$ ersten Ableitungen stetige Functionen von $\vartheta$
im Intervall

\[ \vartheta_1 \le \vartheta \le \vartheta_2, \]

wobei für den einen Teil der folgenden Betrachtungen die Annahme $r = n$,
für den anderen erst $r = 2n$ genügt; ausserdem mögen $\varphi'(\vartheta)$ und $\psi'(\vartheta)$ an
keiner Stelle des Intervalles gleichzeitig verschwinden.

Es wird dann der Punkt $x, y$ ein zusammenhängendes Curvenstück 1 2 in
einem Zuge und in bestimmtem Richtungssinn beschreiben, während $\vartheta$
immer wachsend das Intervall $\vartheta_1 \ldots \vartheta_2$ durchläuft; für eine beliebig gezeichnete
rectificierbare Curve kann z. B. immer $\vartheta = s$ gesetzt, die Bogenlänge zur
unabhängigen Variablen gewählt werden. Zu jedem Werte $\vartheta$ zwischen $\vartheta_1$ und $\vartheta_2$
gehört ein bestimmter Punkt $\varphi(\vartheta), \psi(\vartheta)$ des Curvenstückes und umgekehrt







If this integral is to be independent of the, in certain respects arbitrary, 
manner in which the coordinates of the curve are represented as functions of 
the parameters t, then, as shall be shown, F must be homogeneous of first 
order in its last two arguments x ' , y ' , or, what amounts to the same, it must 
meet the condition 



dF , OF , _ 

x + —y = F 



dx' dy‘ 

from which further formulas arise. 

To investigate the more general case 



J = 




dt 



along similar lines, we first must address the question of the corresponding 
constraint equations for this more general function F . 

In order to deduce them I shall make use of the calculus of variations in 
a way similar to how it has been used to determine the usual “integrabil- 
ity conditions” ever since Euler, since in this case the direct substitution of 
another parameter no longer seems practical. However, this application will 
be a purely formal one and not require any other principles but the basic 
elements of the differential calculus. 

\[ F = F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) \]

(in abbreviated notation) is assumed to be an analytic function of all of its
arguments that is of the character of an entire function over the whole domain
under consideration, and hence also has partial derivatives of arbitrary order.
Let

\[ x = \varphi(\vartheta) , \qquad y = \psi(\vartheta) \]

be single-valued functions of $\vartheta$ that, together with their first $r$
derivatives, are continuous on the interval

\[ \vartheta_1 \le \vartheta \le \vartheta_2 , \]

where, for one part of the following considerations, it suffices to assume that
$r = n$, and for the other, only that $r = 2n$; furthermore, we assume that
$\varphi'(\vartheta)$ and $\psi'(\vartheta)$ do not vanish simultaneously anywhere in the interval.

In this case, the point $x, y$ traces a continuous curve segment 1 2 in one
fell swoop in a specific direction as the ever-increasing $\vartheta$ passes through the
interval $\vartheta_1 \dots \vartheta_2$; given an arbitrarily drawn rectifiable curve, we can, e.g.,
always set $\vartheta = s$ and choose the arc length as the independent variable. To
each value $\vartheta$ between $\vartheta_1$ and $\vartheta_2$ there belongs a particular point $\varphi(\vartheta), \psi(\vartheta)$
of the curve segment, and, conversely, to each of these points a particular $\vartheta$







Zermelo 1894 



zu jedem dieser Punkte ein gewisses i9 des Intervalles. — Wir konnen nun 
i9 = i9(t ) als eine beliebige im Intervall mit ihren ersten r Ableitungen stetige 
und bestandig wachsende Function von t, oder, was dasselbe ist, t als eine 
eben solche Function von i? annehmcn und dann die Endwerte t\ , £2 bestimmt 
denken durch = i?i, $(£ 2 ) = $ 2 - Wenn dann t das Intervall t\ . . . £2 zu- 
nehmend durchlauft, so muss auch 1 ) immer wachsen von $1 bis $2 und das 
Curvenstiick 1 2 erzeugen. 

4 | x 9 = t ergiebt die urspriingliche Darstellung der Curve: 

x = ip(t ) , y = xp(t) , 

x) = x)(t) aber eine andere: 

x = ip($) = Tp(t ) , y = = xp(t ) ; 

und von dieser Form miissen auch alle den gemachten Voraussetzungen 
entsprechenden Darstellungen 7p(t),ip(i) unseres Curvenstiickes sein: immer 
giebt es eine solche Function $ = x 9(t), welche die Uberfiihrung bewirkt. 

Soli nun fiir ein beliebig vorgeschriebenes Curvenstiick 1 2 

x = </?($) , y = V’('d) (i9i ^ i9 ^ x9 2 ) 

unser Integral 



*2 




^1 



einen bestimmten, von der besonderen Form der Darstellung unabhangigen 
Wert besitzen, so darf es nach Fixierung der Functionen ip, xp nur von den 
Endwerten $1 und $2 der Function $ = $(i) abhangen, also bei constantem 
Anfangspunkt $1 nur von dem variablen Endpunkt 1 ) 2 - Es muss also, wenn 
man 1 ) 2 , £2 durch i9, t ersetzt: 

t 

dt = J{&) 

tl 

eine blosse Function von 1 ? sein, und durch Differentiation nach t: 

= D J(0) = (1) 

fiir eine beliebige, nur den Stetigkeitsbedingungen und der Bedingung •&' > 0 
geniigende Function d = x 9(t), also auch fiir d = t, •d' = 1, 

D'V(tf) = , D^xp^Si) = ipM(&) , 





5 | so dass wegen 








of the interval. We may now assume $\vartheta = \vartheta(t)$ to be an arbitrary function of $t$
that, together with its first $r$ derivatives, is continuous and steadily
increasing on the interval. Or, what amounts to the same, we may assume $t$ to be
a function of $\vartheta$ of this sort, and the end values $t_1$, $t_2$ to be determined by
$\vartheta(t_1) = \vartheta_1$, $\vartheta(t_2) = \vartheta_2$. If, then, $t$ passes through the interval $t_1 \dots t_2$ as it
increases, then $\vartheta$, too, must increase from $\vartheta_1$ to $\vartheta_2$ and generate the curve
segment 1 2.
$\vartheta = t$ yields the original representation of the curve:

\[ x = \varphi(t) , \qquad y = \psi(t) , \]

$\vartheta = \vartheta(t)$, however, a different one:

\[ x = \varphi(\vartheta) = \bar\varphi(t) , \qquad y = \psi(\vartheta) = \bar\psi(t) ; \]

and this is the form that all representations $\bar\varphi(t), \bar\psi(t)$ of our curve segment
that satisfy the stated assumptions must have: there is always a function
$\vartheta = \vartheta(t)$ that achieves the transformation.

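Now if, for an arbitrarily prescribed curve segment 1 2

\[ x = \varphi(\vartheta) , \qquad y = \psi(\vartheta) \qquad (\vartheta_1 \le \vartheta \le \vartheta_2) , \]

our integral

\[ J = \int_{t_1}^{t_2} F\bigl(x^{(\mu)}, y^{(\mu)}\bigr)\,dt \]

is supposed to have a particular value independent of the specific form of
representation, then, once the functions $\varphi, \psi$ have been fixed, it may only
depend on the end values $\vartheta_1$ and $\vartheta_2$ of the function $\vartheta = \vartheta(t)$, and hence only
on the variable endpoint $\vartheta_2$, assuming a constant starting point $\vartheta_1$. Thus,
if $\vartheta_2$, $t_2$ are replaced by $\vartheta$, $t$,

\[ \int_{t_1}^{t} F\bigl(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta)\bigr)\,dt = J(\vartheta) \]

must only be a function of $\vartheta$, and, by differentiation with respect to $t$,

\[ F\bigl(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta)\bigr) = D J(\vartheta) = J'(\vartheta)\,\vartheta' \tag{1} \]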

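for any function $\vartheta = \vartheta(t)$ satisfying only the continuity conditions and the
condition $\vartheta' > 0$, and hence also for $\vartheta = t$, $\vartheta' = 1$,

\[ D^\mu\varphi(\vartheta) = \varphi^{(\mu)}(\vartheta) , \qquad D^\mu\psi(\vartheta) = \psi^{(\mu)}(\vartheta) , \]

so that on account of

\[ F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) = J'(\vartheta) \]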







nunmehr (1) geschrieben werden kann: 

F(D> t <p(0),Di t il>(0)) = F & . (la) 

Die Differentiationen aber kann man ausfiihren: 

x' = Dip^) = (p'('d)'d' , x" = <p" ('d)'d' 2 + ip' {'&)'&" , 
u. s. w., allgemein 

xM = D»'p(Q) = 

abgekiirzt = , (2) 

yW = D^{&) = R„ • 

Hier ist R p eine ganze rationale Function ihrer samtlichen Argumente 
mit ganzzahligen positiven Coefficienten, linear und homogen in Bezug auf 
die ersten //, und enthalt die hochste Ableitung nur in dem einen Gliede 

Man kann aber immer n+ 1 willkiirliche Grossen d, d" , . . . ansehen 

als die Werte der successiven Ableitungen einer stetigen Function •& = i?(t) 
fiir einen beliebigen Argumentwert t = t'; wir brauchen ja nur zu setzen 

Im Falle •&' > 0 wircl jede solche Function in einer gewissen Umgebung von 
t = t' mit t immer nur zunehmcn und somit alien an i9 gestellten Forderun- 
gen geniigen. Daher wird die Gleichung (1), die ja fiir beliebige Functionen 
i9(t) gelten sollte, nach Einsetzung der Ausdriicke (2) identisch bestehen fiir 
unbestimmte und unter einander unabhangige Grossen d, mit der 

6 | einzigen Bedingung •&' > 0, so dass 




( 3 ) 

( 4 ) 







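we now can write (1) as

\[ F\bigl(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta)\bigr) = F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\vartheta' . \tag{1a} \]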
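As a plausibility check, (1a) can be written out for $n = 1$. The following specialization is only an illustration and not part of Zermelo's text; the scaling factor $k$ is notation introduced here.

```latex
% n = 1: x' = \varphi'(\vartheta)\vartheta', y' = \psi'(\vartheta)\vartheta', so (1a) reads
F\bigl(\varphi, \psi;\ \varphi'\vartheta', \psi'\vartheta'\bigr)
  = F\bigl(\varphi, \psi;\ \varphi', \psi'\bigr)\,\vartheta' , \qquad \vartheta' > 0 .
% Since \vartheta' = k may take any positive value at a given point, this is
F(x, y;\ k\,x', k\,y') = k\,F(x, y;\ x', y') \qquad (k > 0) ,
% i.e. positive homogeneity of first order in x', y'.
```

For $n = 1$ the requirement thus reduces to the homogeneity condition stated at the outset.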

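But the differentiations can be carried out:

\[ x' = D\varphi(\vartheta) = \varphi'(\vartheta)\,\vartheta' , \qquad x'' = \varphi''(\vartheta)\,\vartheta'^2 + \varphi'(\vartheta)\,\vartheta'' , \]

etc., generally,

\[ x^{(\mu)} = D^\mu\varphi(\vartheta) = R_\mu\bigl(\varphi(\vartheta), \varphi'(\vartheta), \dots \varphi^{(\mu)}(\vartheta), \vartheta', \vartheta'', \dots \vartheta^{(\mu)}\bigr) , \]
\[ \text{abbreviated} \quad = R_\mu\bigl(\varphi^{(\nu)}(\vartheta), \vartheta^{(\nu)}\bigr) , \tag{2} \]
\[ y^{(\mu)} = D^\mu\psi(\vartheta) = R_\mu\bigl(\psi^{(\nu)}(\vartheta), \vartheta^{(\nu)}\bigr) . \]

Here, $R_\mu$ is an integral rational function of all its arguments with
positive integral coefficients that is linear and homogeneous with respect to
the derivatives $\varphi', \dots \varphi^{(\mu)}$, and contains the highest-order derivative $\vartheta^{(\mu)}$ only
in the term $\varphi'(\vartheta)\,\vartheta^{(\mu)}$.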
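To make the stated properties of $R_\mu$ concrete, the next case $\mu = 3$ can be worked out by differentiating $x''$ once more. This is an illustrative computation added here, not a formula from the original.

```latex
x''' = D\bigl(\varphi''(\vartheta)\,\vartheta'^2 + \varphi'(\vartheta)\,\vartheta''\bigr)
     = \varphi'''(\vartheta)\,\vartheta'^3
       + 3\,\varphi''(\vartheta)\,\vartheta'\vartheta''
       + \varphi'(\vartheta)\,\vartheta''' .
% Each term is of first degree in the derivatives of \varphi, all coefficients
% are positive integers, and \vartheta''' occurs only in \varphi'(\vartheta)\vartheta'''.
```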

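But it is always possible to consider $n + 1$ arbitrary quantities $\vartheta, \vartheta', \vartheta'', \dots
\vartheta^{(n)}$ as values of successive derivatives of a continuous function $\vartheta = \vartheta(t)$ for
an arbitrary argument $t = t'$; for we only need to set

\[ \vartheta(t) = \sum_{\mu=0}^{n} \frac{\vartheta^{(\mu)}}{\mu!}\,(t - t')^\mu . \]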
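The construction is the Taylor polynomial with prescribed derivative values. A minimal sketch follows, assuming hypothetical sample values for $\vartheta, \vartheta', \vartheta'', \vartheta'''$ and an expansion point called `t0` here; the function names are introduced for illustration only.

```python
from math import factorial

def taylor_coeffs(values):
    """Coefficients c_mu = values[mu] / mu! in powers of (t - t0)."""
    return [v / factorial(mu) for mu, v in enumerate(values)]

def differentiate(coeffs):
    """Derivative of a polynomial given by coefficients in powers of (t - t0)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t, t0):
    """Horner evaluation of the polynomial at t."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * (t - t0) + c
    return result

# Hypothetical prescribed values of theta, theta', theta'', theta''' at t = t0.
prescribed = [0.7, 2.0, -3.0, 5.0]
t0 = 1.5
coeffs = taylor_coeffs(prescribed)

# Recover the successive derivatives at t0 by repeated differentiation.
recovered = []
p = coeffs
for _ in prescribed:
    recovered.append(evaluate(p, t0, t0))
    p = differentiate(p)

# With theta'(t0) > 0 the first derivative stays positive close to t0.
slope_left = evaluate(differentiate(coeffs), t0 - 0.05, t0)
slope_right = evaluate(differentiate(coeffs), t0 + 0.05, t0)
print(recovered, slope_left, slope_right)
```

The recovered derivatives coincide with the prescribed quantities, which is all that the argument in the text requires.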




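When $\vartheta' > 0$, any function of this kind is increasing in $t$ throughout a
certain neighborhood of $t = t'$, thus meeting all requirements imposed
on $\vartheta$. Therefore, upon substitution of the expressions (2), the equation (1),
which was supposed to hold for arbitrary functions $\vartheta(t)$, holds identically for
indeterminate, mutually independent quantities $\vartheta, \vartheta', \dots \vartheta^{(n)}$, under the sole
condition that $\vartheta' > 0$, so that

\[ \Phi = \frac{F}{\vartheta'} = J'(\vartheta) = \Phi(\vartheta) \]

is independent of $\vartheta', \vartheta'', \dots \vartheta^{(n)}$, and hence

\[ \frac{\partial\Phi}{\partial\vartheta^{(\mu)}} = 0 \quad (\mu = 1, 2, \dots n) , \qquad \frac{\partial\Phi}{\partial\vartheta} = \frac{d\Phi}{d\vartheta} = \frac{D\Phi}{\vartheta'} \tag{3} \]

and furthermore

\[ \delta_\vartheta\Phi = \sum_{\mu=0}^{n} \frac{\partial\Phi}{\partial\vartheta^{(\mu)}}\,\delta\vartheta^{(\mu)} = \frac{\partial\Phi}{\partial\vartheta}\,\delta\vartheta = D\Phi\,\frac{\delta\vartheta}{\vartheta'} = D\Phi\cdot\tau , \tag{4} \]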



where $\delta\vartheta$ denotes an arbitrary function of $t$ that, together with its first $n$
derivatives

\[ D^\mu\,\delta\vartheta = \delta\vartheta^{(\mu)} , \]







stetige Function von t bedeutet; auch 



W 



ist solch eine willkiirliche Function. 
Es ist aber allgemein: 



n 

6* f = Y, 



df 



fi — 0 









—f(D + e6#,D , +e6'&',...) 
OS 



J £ = 0 



„die erste Variation von / in Bezug auf $ u und folgt den Gesetzen: 
5H(f 1 ,f 2 ,...) = ^-6f 1 + ^6h + ... 

Of 1 0/2 



Daher ist einerseits: 



dF 



fj — O "~n <■ 



SD^f = D^Sf . 
f dF 



fj. = o 



dx M 



5x^ + 



dy(v) 



da 



5$ 

Sx = Sift'd) = <p'('&)6‘& = <p , ('&)‘d , — 

v' 



= Dipfd).T = a/r 



( 5 ) 



und ebenso: 



(5y = y'r 



ist, andrerseits aber wegen (4) 

5$F = S${<!>'d') = S^.'d' + <£<W' 
= D’P.T'd' + $D(d'r) 

= D{^'t) = D(Ft) , 



also: 



6,F=Y\ 

n=o 



r of 



dxtF) 



(. x't + 



dF 

dytF) 



(yV)^} = D (Ft) . 



(6) 







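is continuous; also

\[ \tau = \frac{\delta\vartheta}{\vartheta'} \]

is an arbitrary function of this kind.
But, generally,

\[ \delta_\vartheta f = \sum_{\mu=0}^{n} \frac{\partial f}{\partial\vartheta^{(\mu)}}\,\delta\vartheta^{(\mu)}
   = \left[ \frac{\partial}{\partial\varepsilon}\, f(\vartheta + \varepsilon\,\delta\vartheta,\ \vartheta' + \varepsilon\,\delta\vartheta', \dots) \right]_{\varepsilon = 0} \tag{5} \]

is “the first variation of $f$ with respect to $\vartheta$” and is subject to the laws

\[ \delta H(f_1, f_2, \dots) = \frac{\partial H}{\partial f_1}\,\delta f_1 + \frac{\partial H}{\partial f_2}\,\delta f_2 + \dots , \qquad \delta D^\mu f = D^\mu\,\delta f . \]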



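Hence, on the one hand,

\[ \delta_\vartheta F = \sum_{\mu=0}^{n} \left\{ \frac{\partial F}{\partial x^{(\mu)}}\,\delta x^{(\mu)} + \frac{\partial F}{\partial y^{(\mu)}}\,\delta y^{(\mu)} \right\}
   = \sum_{\mu=0}^{n} \left\{ \frac{\partial F}{\partial x^{(\mu)}}\,(x'\tau)^{(\mu)} + \frac{\partial F}{\partial y^{(\mu)}}\,(y'\tau)^{(\mu)} \right\} , \]

since

\[ \delta x = \delta\varphi(\vartheta) = \varphi'(\vartheta)\,\delta\vartheta = \varphi'(\vartheta)\,\vartheta'\,\frac{\delta\vartheta}{\vartheta'} = D\varphi(\vartheta)\cdot\tau = x'\tau \]

and likewise

\[ \delta y = y'\tau , \]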

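while, on the other hand,$^1$

\[ \delta_\vartheta F = \delta_\vartheta(\Phi\vartheta') = \delta_\vartheta\Phi\cdot\vartheta' + \Phi\,\delta_\vartheta\vartheta' \]
\[ = D\Phi\cdot\tau\vartheta' + \Phi\,D(\vartheta'\tau) \]
\[ = D(\Phi\vartheta'\tau) = D(F\tau) , \]

on account of (4), and hence

\[ \delta_\vartheta F = \sum_{\mu=0}^{n} \left\{ \frac{\partial F}{\partial x^{(\mu)}}\,(x'\tau)^{(\mu)} + \frac{\partial F}{\partial y^{(\mu)}}\,(y'\tau)^{(\mu)} \right\} = D(F\tau) . \tag{6} \]

$^1$ [[In the first line of the following formula, Zermelo erroneously writes “$\delta$” for the
fourth “$\delta_\vartheta$”.]]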
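The content of (6) is easiest to see for $n = 1$. The following worked case is an illustration added here; it uses the shorthand $X_\mu = \partial F/\partial x^{(\mu)}$, $Y_\mu = \partial F/\partial y^{(\mu)}$ that the text introduces further on.

```latex
% (6) for n = 1, with D(F\tau) = DF\cdot\tau + F\,\tau' :
X_0\,x'\tau + Y_0\,y'\tau + X_1\,(x''\tau + x'\tau') + Y_1\,(y''\tau + y'\tau')
  = DF\cdot\tau + F\,\tau' .
% coefficient of \tau' : the homogeneity condition
X_1\,x' + Y_1\,y' = F ,
% coefficient of \tau : holds identically by the chain rule
X_0\,x' + Y_0\,y' + X_1\,x'' + Y_1\,y'' = DF .
```

Only the coefficient of $\tau'$ yields a genuine restriction on $F$, in agreement with the single condition known for first-order integrands.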







Diese Beziehung muss bestehen fiir beliebige Functionen x , y , r von t, 
da ja auch die Functionen p> und ip willkiirlich sein sollen. Wir brauchen 
claher nur nach formaler Ausfiihrung der Differentiationen die Coefficienten 
cler r, t' , . . . beiderseits einander gleich zu setzen, um das System der von 
den unabhangigen Bedingungsgleichungen fiir F zu erhalten. 

Diese Bedingungen sind zugleich auch hinreichend fiir das Bestehen der 
verlangten Eigenschaft. Denn sind sie erfiillt, so gelten auch (6), (4) fiir 
willkurliche ip, ip, i), r oder Si9 und daher, durch Coefhcientenvergleichung, 
auch (3). Dann lasst sich fiir irgencl zwei Functionen p,ip auch imrner J' (i9) 
der Gleichung (1) gemass bestimmen, und durch Integration folgt schliesslich: 

ti 

J dt = J(0 2 ) - J(V i) , 

^1 

8 | in der That nur abhangig von der durch ip, ip bestimmten Form der Curve 

und ihren durch d\ und $2 bestimmten Endpunkten fiir die Integration, aber 
unabhangig von der Beziehung zwischen t und d, cl. h. von der besonderen 
Darstellungsform. 

Um nun die Bedingungsgleichungen noch in andrer Gestalt nebst den zwi- 
schen beiden bestehenden Beziehungen darzustellen, schicke ich einige allge- 
mein giiltige formale Entwicklungen voraus, in denen zur Abkiirzung 

dF OF 

dx(F) ^ ’ dy(F) M 

geschrieben werde. 

Nach einer in der Variationsrechnung allgemein gebrauchlichen Umfor- 
mung durch partielle Integration ist fiir eine beliebige Function F(x^ ,y^) 
wenn x, y und r irgendwelche Functionen von t bedeuten: 

n n 

Y X„(x't)M =Px't + dY j , (7) 

fj, — 0 fi — 1 

wenn 



Xu — + D P ^ _|_ i (/i — 0, 1, . . . n) (8) 

(_P n _|_ i — 0 , cllsO P n — X-ri) 

n — 

P„ = XI (“ 1 ) XDXX » + x (/x = 0, 1, . . . n, P 0 =P). 

x = o 

Durch Entwicklung der (x't)^ und Zusammenfassung gleichnamiger 
Glieder geht (7) iiber in: 

n n 

Y = 77r + D Y n ^ v " 1} - 

is = 0 v—\ 



(9) 







This relationship must hold for arbitrary functions $x$, $y$, $\tau$ of $t$, since the
functions $\varphi$ and $\psi$ are supposed to be arbitrary as well. After having formally
carried out the differentiations, we therefore only need to set the coefficients of
the $\tau, \tau', \dots \tau^{(n)}$ equal to one another on both sides in order to obtain for $F$
the system of the constraint equations independent of the $\tau^{(\nu)}$.

At the same time, these conditions are also sufficient for the required
property to obtain. For if they are satisfied, then (6), (4) also hold for
arbitrary $\varphi, \psi, \vartheta, \tau$ or $\delta\vartheta$, and hence, by virtue of comparison of coefficients,
(3) holds as well. For any two functions $\varphi, \psi$, it is then always possible to
determine also $J'(\vartheta)$ in accordance with the equation (1), and by integration
it eventually follows that

\[ \int_{t_1}^{t_2} F\bigl(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta)\bigr)\,dt = J(\vartheta_2) - J(\vartheta_1) , \]

which really only depends on the shape of the curve, determined by $\varphi, \psi$, and
on its endpoints for the integration, determined by $\vartheta_1$ and $\vartheta_2$, but which is
independent of the relation between $t$ and $\vartheta$, that is, of the particular form
of representation.

In order to represent the constraint equations in yet another form, together
with the relations obtaining between the two forms, we will first consider several
generally valid formal expansions, using the abbreviations

\[ \frac{\partial F}{\partial x^{(\mu)}} = X_\mu , \qquad \frac{\partial F}{\partial y^{(\mu)}} = Y_\mu . \]

According to a transformation by partial integration commonly used in
the calculus of variations, we have, for any function $F(x^{(\mu)}, y^{(\mu)})$, where $x$, $y$
and $\tau$ denote arbitrary functions of $t$,

\[ \sum_{\mu=0}^{n} X_\mu\,(x'\tau)^{(\mu)} = P\,x'\tau + D \sum_{\mu=1}^{n} P_\mu\,(x'\tau)^{(\mu-1)} , \tag{7} \]

if

\[ X_\mu = P_\mu + D P_{\mu+1} \qquad (\mu = 0, 1, \dots n) \tag{8} \]
\[ (P_{n+1} = 0 ,\ \text{hence}\ P_n = X_n) , \]
\[ P_\mu = \sum_{\kappa=0}^{n-\mu} (-1)^\kappa D^\kappa X_{\mu+\kappa} \qquad (\mu = 0, 1, \dots n ,\ P_0 = P) . \]
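For orientation, the identity (7) can be verified directly in the case $n = 2$. This is an illustrative check added here, in which $u = x'\tau$ is shorthand not used in the original.

```latex
% n = 2:  P_2 = X_2,  P_1 = X_1 - DX_2,  P = P_0 = X_0 - DX_1 + D^2X_2 .
D\bigl[(X_1 - DX_2)\,u + X_2\,u'\bigr]
  = (DX_1 - D^2X_2)\,u + X_1\,u' + X_2\,u'' ,
% and adding P\,u gives back the left-hand side of (7):
(X_0 - DX_1 + D^2X_2)\,u + D\bigl[(X_1 - DX_2)\,u + X_2\,u'\bigr]
  = X_0\,u + X_1\,u' + X_2\,u'' .
```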

By expanding the $(x'\tau)^{(\mu)}$ and collecting like terms, (7) is transformed
into

\[ \sum_{\nu=0}^{n} \Xi_\nu\,\tau^{(\nu)} = \Pi\,\tau + D \sum_{\nu=1}^{n} \Pi_\nu\,\tau^{(\nu-1)} , \tag{9} \]







wenn 



n 



■=!/ = 


E (e)^ (M ^ +1) 


(10) 




fl, = IS 




II 

1 1 


) , n = n 0 = p x ' 


(11) 


/J, = is 








(v= 1,2, ...n) . 





Aus (9) aber folgt durch Vergleichung der Coefficienten von r links und 
rechts, was sich auch durch direkte Benutzung von (8), (10), (11) nachweisen 
liesse: 

S v = FI V + D II v + i {v = 0, 1, . . .n) (12) 

( F n -j- l — 0, also IJjy — ^i/) , 

daher: 

n — is 

n v = ^(-l)^^ + x . 

X = 0 

Die analogen Beziehungen bestehen, wenn man die Ausdriicke 
t (m) X P ~ TT 
der Reihe nach ersetzt durch: 

7/ (m) Y O H P 

y 7 J /X7 ^//j.7 J± IS’> 1 IS 7 

und durch Zusammenziehung der entsprechenden Gleichungen erhalt man 
aus (7), (9) und (12): 

n 

S# F = + W^) (/i) } (7a) 

fi — 0 

n 

= (. Px ' + Qy') T + DJ2 “ 1} + Q^y'r)^ ~ 1} } 

M — 1 
n 

= Y / {S v +H v )t^ (9a) 

IZ = 0 

n 

= {ii + p)t + dY / (ii v + p v )t<‘ v - 1 '> 

is = 1 

10 I ~ v + H l/ =n v +P v +D(II l/ + 1 +P v + 1 ) (12a) 

n — is 

n u + p u = Y,{-P) x d x {z v+x + h v+x ) 

x — 0 

(v = 0,l,2,...n) . 













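if

\[ \Xi_\nu = \sum_{\mu=\nu}^{n} \binom{\mu}{\nu}\,X_\mu\,x^{(\mu-\nu+1)} , \tag{10} \]
\[ \Pi_\nu = \sum_{\mu=\nu}^{n} \binom{\mu-1}{\nu-1}\,P_\mu\,x^{(\mu-\nu+1)} , \qquad \Pi = \Pi_0 = P x' \tag{11} \]
\[ (\nu = 1, 2, \dots n) . \]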

But from (9) we obtain by comparison of the coefficients of $\tau^{(\nu)}$ on the
right and left sides what could also be verified by direct use of (8), (10), (11):

\[ \Xi_\nu = \Pi_\nu + D\,\Pi_{\nu+1} \qquad (\nu = 0, 1, \dots n) \tag{12} \]
\[ (\Pi_{n+1} = 0 ,\ \text{hence}\ \Pi_n = \Xi_n) , \]

and therefore

\[ \Pi_\nu = \sum_{\kappa=0}^{n-\nu} (-1)^\kappa D^\kappa\,\Xi_{\nu+\kappa} . \]
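The expressions (10) and (11) arise from Leibniz's rule, $(x'\tau)^{(\mu)} = \sum_{\nu} \binom{\mu}{\nu}\,x^{(\mu-\nu+1)}\,\tau^{(\nu)}$. Written out for $n = 2$, as an illustration added here, they read:

```latex
\Xi_0 = X_0\,x' + X_1\,x'' + X_2\,x''' , \qquad
\Xi_1 = X_1\,x' + 2\,X_2\,x'' , \qquad
\Xi_2 = X_2\,x' ,
\\
\Pi_1 = P_1\,x' + P_2\,x'' , \qquad
\Pi_2 = P_2\,x' .
% Check of (12) for \nu = 1, using P_2 = X_2 and X_1 = P_1 + DP_2:
\Pi_1 + D\,\Pi_2 = (P_1 + DP_2)\,x' + 2\,P_2\,x'' = X_1\,x' + 2\,X_2\,x'' = \Xi_1 .
```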

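The analogous relations hold when the expressions

\[ x^{(\mu)} ,\quad X_\mu ,\quad P_\mu ,\quad \Xi_\nu ,\quad \Pi_\nu \]

are replaced by, respectively,

\[ y^{(\mu)} ,\quad Y_\mu ,\quad Q_\mu ,\quad \mathrm{H}_\nu ,\quad \mathrm{P}_\nu \]

(the upright $\mathrm{H}$ and $\mathrm{P}$ stand for capital eta and capital rho).
Collecting the corresponding equations, we obtain from (7), (9) and (12):

\[ \delta_\vartheta F = \sum_{\mu=0}^{n} \left\{ X_\mu\,(x'\tau)^{(\mu)} + Y_\mu\,(y'\tau)^{(\mu)} \right\} \tag{7a} \]
\[ = (Px' + Qy')\,\tau + D \sum_{\mu=1}^{n} \left\{ P_\mu\,(x'\tau)^{(\mu-1)} + Q_\mu\,(y'\tau)^{(\mu-1)} \right\} \]
\[ = \sum_{\nu=0}^{n} (\Xi_\nu + \mathrm{H}_\nu)\,\tau^{(\nu)} \tag{9a} \]
\[ = (\Pi + \mathrm{P})\,\tau + D \sum_{\nu=1}^{n} (\Pi_\nu + \mathrm{P}_\nu)\,\tau^{(\nu-1)} , \]
\[ \Xi_\nu + \mathrm{H}_\nu = \Pi_\nu + \mathrm{P}_\nu + D\,(\Pi_{\nu+1} + \mathrm{P}_{\nu+1}) , \tag{12a} \]
\[ \Pi_\nu + \mathrm{P}_\nu = \sum_{\kappa=0}^{n-\nu} (-1)^\kappa D^\kappa\,(\Xi_{\nu+\kappa} + \mathrm{H}_{\nu+\kappa}) \]
\[ (\nu = 0, 1, 2, \dots n) . \]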







Die bisherigen Beziehungen gelten, wenn X Y M beliebige Functionen 
von t sind. 1st aber: 



F = F 






x - dF 



Y =^L 



so ist nach (10): 



~ 0 + fli, = s + jt = {^ (M+1) +^y (M+1) } = ^, (is) 

/i = 0 



und weil nach (12a) 



n + P = E + H-D{n 1 +P 1 ) 

Px' + Qy' = n + P = D(F-IIi- Pi) . (14) 

Fur solche F aber, welche unserer Forderung, d. h. der Gleichung (la) 
genii gen, ist 

n 

5#F = J2 {X^x'r)^ FY^v't)^} (6) 

fi = 0 

n 

= Y (E v +H v )tM=D{Ft) , 

11 = 0 

oder wegen (7a) und (9a) 

n 

D(Ft) = (Px' + Qy')r + dY { P h( x ' t )^ ~ 1} + Qu,(y'T)^ (6a) 

M — 1 
n 

= (n + p) t + dY (n v + p v ) T ^~^ . 

t/=i 



11 | Es miisste also ( Px ' + Qy')r = (II + P)r fiir willkiirliches r = r(t) eine 

vollstandige Ableitung sein, was nur moglich ist, wenn 



n + P = Px' + Qy' = 0 , 



(15) 







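The previous relations hold when $X_\mu$, $Y_\mu$ are arbitrary functions of $t$. But
if

\[ F = F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) , \qquad X_\mu = \frac{\partial F}{\partial x^{(\mu)}} , \qquad Y_\mu = \frac{\partial F}{\partial y^{(\mu)}} , \]

then, by (10),

\[ \Xi_0 + \mathrm{H}_0 = \Xi + \mathrm{H} = \sum_{\mu=0}^{n} \left\{ X_\mu\,x^{(\mu+1)} + Y_\mu\,y^{(\mu+1)} \right\} = DF , \tag{13} \]

and since, by (12a),

\[ \Pi + \mathrm{P} = \Xi + \mathrm{H} - D\,(\Pi_1 + \mathrm{P}_1) , \]
\[ Px' + Qy' = \Pi + \mathrm{P} = D\,(F - \Pi_1 - \mathrm{P}_1) . \tag{14} \]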

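But for $F$ satisfying our demand, i.e., the equation (1a),

\[ \delta_\vartheta F = \sum_{\mu=0}^{n} \left\{ X_\mu\,(x'\tau)^{(\mu)} + Y_\mu\,(y'\tau)^{(\mu)} \right\} \tag{6} \]
\[ = \sum_{\nu=0}^{n} (\Xi_\nu + \mathrm{H}_\nu)\,\tau^{(\nu)} = D(F\tau) , \]

or, on account of (7a) and (9a),

\[ D(F\tau) = (Px' + Qy')\,\tau + D \sum_{\mu=1}^{n} \left\{ P_\mu\,(x'\tau)^{(\mu-1)} + Q_\mu\,(y'\tau)^{(\mu-1)} \right\} \tag{6a} \]
\[ = (\Pi + \mathrm{P})\,\tau + D \sum_{\nu=1}^{n} (\Pi_\nu + \mathrm{P}_\nu)\,\tau^{(\nu-1)} . \]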



Hence, $(Px' + Qy')\,\tau = (\Pi + \mathrm{P})\,\tau$ would have to be a total derivative for
an arbitrary $\tau = \tau(t)$, which is possible only if

\[ \Pi + \mathrm{P} = Px' + Qy' = 0 , \tag{15} \]







sodass die vorhergehende Gleichung durch Integration tibergeht in: 

n 

E + (16) 

M — 1 

n 

= E (^ + P,)r ( ''- 1) =Fr, 

!^= 1 

wo die Integrationsconstante offenbar verschwindet. 

Durch Coefficienten-Vergleichung erhalt man daraus mit Hilfe von (10) 
und (11) 

n 

E u +H u = E (^){x ii x^ +1 %y^- v + 1 '>} =e v>1 F, (17) 

fl = V 

n 

n„ + p„= E (ri 1 ){V"~ , ' +1) +0/" , ' +1) }=M f (18) 

fi = n 

(v = l,2,...n) , 



wo, wie auch im Folgenden, 

1 (M — ^) 7 0 (M 7 ^ ^) 

die Bedeutung des Kroneckerschen Symbols (<5 Mj „) besitzt. 

Diese beiden Systeme von n Gleichungen stellen, jedes fiir sich allein, die 
vollstdndigen Bedingungen fiir die Erfiillung unserer Forderung dar. Denn die 
Coefficienten von r auf beiden Seiten von (6) stimmen nach (13) identisch 
iiberein, die von > 0) aber werden durch (17) zur Ubereinstimmung 

gebracht. Ferner entsteht (15) wegen (14) durch einfache Differentiation aus 
| der zu (18) gehorigen Gleichung: 



77i + Pi = F , 

sodass auch (18) gleichbedeutend ist mit (6a). Es folgt also aus (17) fiir den 
urspriinglichen, aus (18) fiir den transformierten Ausdruck jedesmal dieselbe 
Formel: 



5$F = D{F t) , (6) 

die auch als hinreichende Bedingung bereits nachgewiesen ist. 

Der unmittelbare analytische Zusammenhang dieser beiden Systeme (17) 
und (18) wird durch die Formeln (12a) gegeben, durch deren zweite man 
unmittelbar (18) aus (17), durch deren erste aber umgekehrt (17) aus (18) 
ableiten kann. 

Trotz dieser Aquivalenz wird man naturgemass das System (17) als die 
einfachere, urspriinglichere Form der Bedingungen ansehen miissen, aus wel- 
cher die andere (18) erst durch Differentiation hervorgegangen ist. Doch wird 







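so that, by integration, the previous equation is transformed into

\[ \sum_{\mu=1}^{n} \left\{ P_\mu\,(x'\tau)^{(\mu-1)} + Q_\mu\,(y'\tau)^{(\mu-1)} \right\} \tag{16} \]
\[ = \sum_{\nu=1}^{n} (\Pi_\nu + \mathrm{P}_\nu)\,\tau^{(\nu-1)} = F\tau , \]

where the constant of integration obviously vanishes.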

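From this we obtain, by comparison of coefficients and with the help of
(10) and (11),

\[ \Xi_\nu + \mathrm{H}_\nu = \sum_{\mu=\nu}^{n} \binom{\mu}{\nu} \left\{ X_\mu\,x^{(\mu-\nu+1)} + Y_\mu\,y^{(\mu-\nu+1)} \right\} = \varepsilon_{\nu,1}\,F , \tag{17} \]
\[ \Pi_\nu + \mathrm{P}_\nu = \sum_{\mu=\nu}^{n} \binom{\mu-1}{\nu-1} \left\{ P_\mu\,x^{(\mu-\nu+1)} + Q_\mu\,y^{(\mu-\nu+1)} \right\} = \varepsilon_{\nu,1}\,F \tag{18} \]
\[ (\nu = 1, 2, \dots n) , \]

where, as also in the following,

\[ \varepsilon_{\mu,\nu} = 1 \quad (\mu = \nu) , \qquad \varepsilon_{\mu,\nu} = 0 \quad (\mu \ne \nu) \]

has the meaning of Kronecker’s symbol $(\delta_{\mu,\nu})$.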
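The system (17) can be tested numerically for $n = 2$. The sketch below is an illustration under stated assumptions and not taken from the text: it uses the turning-angle integrand $F = (x'y'' - y'x'')/(x'^2 + y'^2)$ as a second-order example whose integral is known to be independent of the parametrization, and it approximates the partial derivatives by central differences. All function names are hypothetical.

```python
def F(x, y, x1, y1, x2, y2):
    """Example integrand (assumed here): (x'y'' - y'x'') / (x'^2 + y'^2)."""
    return (x1 * y2 - y1 * x2) / (x1 * x1 + y1 * y1)

def partial(f, args, i, h=1e-5):
    """Central-difference partial derivative of f with respect to argument i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def residuals_17(args):
    """Residuals of (17) for n = 2, i.e. for nu = 1 and nu = 2."""
    x, y, x1, y1, x2, y2 = args
    X1 = partial(F, args, 2)   # dF/dx'
    Y1 = partial(F, args, 3)   # dF/dy'
    X2 = partial(F, args, 4)   # dF/dx''
    Y2 = partial(F, args, 5)   # dF/dy''
    # nu = 1: (X1 x' + Y1 y') + 2 (X2 x'' + Y2 y'') = F
    r1 = (X1 * x1 + Y1 * y1) + 2 * (X2 * x2 + Y2 * y2) - F(*args)
    # nu = 2: X2 x' + Y2 y' = 0
    r2 = X2 * x1 + Y2 * y1
    return r1, r2

# Arbitrary sample values of x, y, x', y', x'', y''.
r1, r2 = residuals_17([0.3, -1.2, 0.8, 0.5, -0.4, 1.1])
print(r1, r2)
```

Both residuals vanish up to finite-difference error, so this integrand satisfies the two equations of (17) at the sample point.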

Each of these two systems of $n$ equations represents, by itself, the complete
requirements for the satisfaction of our demand. For the coefficients of $\tau$ on both
sides of (6) agree identically according to (13), while those of $\tau^{(\mu)}$ $(\mu > 0)$ are
brought into agreement by means of (17). Furthermore, we obtain (15) on
account of (14) by simple differentiation from the equation belonging to (18):

\[ \Pi_1 + \mathrm{P}_1 = F , \]

so that (18), too, has the same meaning as (6a). Hence, the same formula
follows from (17) for the original expression and from (18) for the transformed
expression, namely

\[ \delta_\vartheta F = D(F\tau) , \tag{6} \]

which has already been shown to be a sufficient condition as well. 

The direct analytic connection between the two systems (17) and (18) is 
given by the formulas (12a), the second of which allows for the immediate 
derivation of (18) from (17) and, conversely, the first of which, that of (17) 
from (18). 

This equivalence notwithstanding, we will of course have to consider the 
system (17) the more primitive and more natural representation of the con- 
ditions, from which (18) arises only by differentiation. But it is just the latter 







gerade die zweite in Gestalt der Gleichungen (16) und 

n 

n i + Pi = Y, + Q»v w ) = F 



( 18 )i 



m = i 



bei den spateren Untersuchungen angewandt werden. 

Beide Systeme haben eine einzige Gleichung gemeinsam, namlich (v = n): 



X n x' + Y n y' = P n x' + Q n y' 



(17)„, (18) r 



dF 

dx( n ) 



dF , 

+ W* y =e ” a 



welche fiir n = 1 in der Form: 



dF , OF , _ 

dx' X + dy' V ~ 

als die in diesem Falle einzige Bedingung die Homogeneitat der Function 
F = F(x, y ; x' , if ) in Bezug auf x' und y' ausdriickt. 

Durch partielle Differentiation der allgemeinen Formel (17)„, (18) n nach 
x und y ergiebt sich: 



d 2 F 



d 2 F 



dx^dxd 1 ) dx( n )dy( n ) 



d 2 F 



d 2 F 



y' = o 

y = o , 



dx^dy^F) dy^dyd 1 ) 
sodass man, wie es spater geschehen wird, setzen kann: 
d 2 F „ „ d 2 F 



dxdddx^F) 



= y’ z f 1 , 
d 2 F 



wo 



Fi = 



dx ( n )i9j/( n ) 
d 2 F d 2 F 



dydddy^F) 
= -x’y’ F-\ , 



= x ,z F\ 



(19) 



dxdddxd 1 ) dy^dyd 1 ) 



:(x' 2 +y' 2 ) 



eine von x^\y^ (ft = 0,1,... n) abhangige Function ist, welche endlich, 
eindeutig und stetig bleibt, so lange die partiellen Ableitungen von F es sind 
und x',y' nicht beide gleichzeitig verschwinden. 

Der urspriinglichen Bedingungsgleichung (1) oder (la) kann man noch 
eine sehr gebrauchliche Form geben, wenn man in der Umgebung einer Stelle 
t = d, wo tp't'd) ^ 0 ist, eine Function d(t) bestimmt durch die Gleichung: 
t = y{'d) = x, also 

= e M ,i (^ > 0) ; D^{d) = , 







which will be used in the form of the equations (16) and

\[ \Pi_1 + \mathrm{P}_1 = \sum_{\mu=1}^{n} \left( P_\mu\,x^{(\mu)} + Q_\mu\,y^{(\mu)} \right) = F \tag*{$(18)_1$} \]

later in our investigations.

Both systems have a single equation in common, namely $(\nu = n)$:

\[ X_n x' + Y_n y' = P_n x' + Q_n y' = \frac{\partial F}{\partial x^{(n)}}\,x' + \frac{\partial F}{\partial y^{(n)}}\,y' = \varepsilon_{n,1}\,F , \tag*{$(17)_n$, $(18)_n$} \]

which, for $n = 1$, in the form

\[ \frac{\partial F}{\partial x'}\,x' + \frac{\partial F}{\partial y'}\,y' = F \]

expresses the homogeneity of the function $F = F(x, y; x', y')$ with respect
to $x'$ and $y'$, being the sole condition in this case.

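Partial differentiation of the general formula $(17)_n$, $(18)_n$ with respect to
$x^{(n)}$ and $y^{(n)}$ yields

\[ \frac{\partial^2 F}{\partial x^{(n)}\,\partial x^{(n)}}\,x' + \frac{\partial^2 F}{\partial x^{(n)}\,\partial y^{(n)}}\,y' = 0 , \]
\[ \frac{\partial^2 F}{\partial x^{(n)}\,\partial y^{(n)}}\,x' + \frac{\partial^2 F}{\partial y^{(n)}\,\partial y^{(n)}}\,y' = 0 , \]

so that we can put, as will be done later,

\[ \frac{\partial^2 F}{\partial x^{(n)}\,\partial x^{(n)}} = y'^2 F_1 , \qquad \frac{\partial^2 F}{\partial x^{(n)}\,\partial y^{(n)}} = -x'y' F_1 , \qquad \frac{\partial^2 F}{\partial y^{(n)}\,\partial y^{(n)}} = x'^2 F_1 , \tag{19} \]

where

\[ F_1 = \frac{ \dfrac{\partial^2 F}{\partial x^{(n)}\,\partial x^{(n)}} + \dfrac{\partial^2 F}{\partial y^{(n)}\,\partial y^{(n)}} }{ x'^2 + y'^2 } \]

is a function dependent on $x^{(\mu)}, y^{(\mu)}$ $(\mu = 0, 1, \dots n)$, which remains finite,
single-valued, and continuous as long as the partial derivatives of $F$ do and
as long as $x', y'$ do not both simultaneously vanish.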
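As an illustration of (19), not contained in the original, consider $n = 1$ with the arc-length integrand $F = \sqrt{x'^2 + y'^2}$, abbreviating $r = \sqrt{x'^2 + y'^2}$:

```latex
\frac{\partial^2 F}{\partial x'^2} = \frac{y'^2}{r^3} , \qquad
\frac{\partial^2 F}{\partial x'\,\partial y'} = -\frac{x'y'}{r^3} , \qquad
\frac{\partial^2 F}{\partial y'^2} = \frac{x'^2}{r^3} ,
\qquad\text{so that}\qquad
F_1 = \frac{1}{r^3} = \frac{1}{(x'^2 + y'^2)^{3/2}} .
```

Here $F_1$ is indeed finite and continuous wherever $x'$ and $y'$ do not vanish together.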

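We can also represent the original constraint equation (1) or (1a) in a very
common form by determining a function $\vartheta(t)$ in the vicinity of a position $t = \vartheta$
where $\varphi'(\vartheta) \ne 0$ by means of the equation $t = \varphi(\vartheta) = x$, and hence

\[ D^\mu\varphi(\vartheta) = \varepsilon_{\mu,1} \quad (\mu > 0) ; \qquad D^\mu\psi(\vartheta) = \frac{d^\mu y}{dx^\mu} , \]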







wo 

dy_j/_ cPy x'y" - y’x” 

dx x’ ’ dx 2 x ,3 

u. s. w., allgemein: 

d»y _ (a:', a . .x M \y',y", . . .y^) _ ,S M {x^\y^) 

dx v x’ 2 ^ ~ 1 x ,2 ^ ~ 1 

gesetzt werden kann, wenn x ^ = D^ip^d), y^ = D^tp(d) bedeutet fur 
beliebige Beziehungen zwischen t und d, also auch fur d = t, x ^ = <^^)(d), 
j/(^) = t/;(^)(d). Setzt man nun die oben gefundenen Ausdriicke in (la) ein, 
so erhalt man: 



14 



F 




’ dx ’ 



dx n / \ ’ dx v 

= F(^)(d),V>^(d)) ^ . 



Nun ist, wieder nach (la), fur beliebiges d = d(t) 



F = F(FV(d),L>^(d)) = F (^(d),V (#i) (d)) d' 

/ d^y\ dx dd / d^\ , 



d. h. 



F 



(x (/i) ,2/ (/i) ) = F ^r,y 



dy 

dx’ 



d^ 

dx n 



(20) 



eine identische Beziehung, wenn man fur 



dx v 



ihre eben gefundenen Aus- 



driicke durch die x^\y^ einsetzt. 

Das angewandte Verfahren und die gefundenen Bedingungsgleichungen 
lassen sich unmittelbar auf den allgemeineren Fall einer grosseren Anzahl 
von Variablen x, y, z, . . . iibertragen. So wird z. B. ein Integral: 



*2 

J = J F^W.yW.zW) 



dt 



iiber ein Stuck 1 2 einer Raumcurve : x = tp(t), y = ip(t), z = x(t) erstreckt, 
einen von der besonderen Form der Darstellung, d. h. von der Wahl des Pa- 
rameters t unabhangigen Wert besitzen, wenn: 



zz v + H v + Z v — 77 „ + P v + T v — e v> \F 
{v= 1 , 2 , ...n) , 



(17), (18) 







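where it is possible to put

\[ \frac{dy}{dx} = \frac{y'}{x'} , \qquad \frac{d^2y}{dx^2} = \frac{x'y'' - y'x''}{x'^3} , \]

etc., generally,

\[ \frac{d^\mu y}{dx^\mu} = \frac{S_\mu(x', x'', \dots x^{(\mu)};\ y', y'', \dots y^{(\mu)})}{x'^{\,2\mu-1}} = \frac{S_\mu(x^{(\nu)}, y^{(\nu)})}{x'^{\,2\mu-1}} , \]

if $x^{(\mu)}$ is $D^\mu\varphi(\vartheta)$ and $y^{(\mu)}$ is $D^\mu\psi(\vartheta)$ for any relation between $t$ and $\vartheta$, and
hence also for $\vartheta = t$, $x^{(\mu)} = \varphi^{(\mu)}(\vartheta)$, $y^{(\mu)} = \psi^{(\mu)}(\vartheta)$. Substituting the
expressions found above into (1a), one obtains

\[ F\Bigl(x, 1, 0, \dots 0;\ y, \frac{dy}{dx}, \dots \frac{d^ny}{dx^n}\Bigr) = F\Bigl(x, \frac{d^\mu y}{dx^\mu}\Bigr) = F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\frac{d\vartheta}{dx} . \]

We now have, again on account of (1a), for arbitrary $\vartheta = \vartheta(t)$,

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = F\bigl(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta)\bigr) = F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\vartheta' \]
\[ = F\Bigl(x, \frac{d^\mu y}{dx^\mu}\Bigr)\,\frac{dx}{d\vartheta}\,\frac{d\vartheta}{dt} = F\Bigl(x, \frac{d^\mu y}{dx^\mu}\Bigr)\,x' , \]

i.e.,

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = F\Bigl(x, \frac{d^\mu y}{dx^\mu}\Bigr)\,x' , \tag{20} \]

an identical relation if the $\dfrac{d^\mu y}{dx^\mu}$ are replaced by the expressions in $x^{(\nu)}, y^{(\nu)}$
just found for them.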
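For a concrete instance of (20), added here for illustration and assuming $x' > 0$, take $n = 1$ and the arc-length integrand:

```latex
F(x, y;\ x', y') = \sqrt{x'^2 + y'^2}
  = \sqrt{1 + \Bigl(\frac{y'}{x'}\Bigr)^{2}}\;x'
  = F\Bigl(x, y;\ 1, \frac{dy}{dx}\Bigr)\,x' ,
\qquad \frac{dy}{dx} = \frac{y'}{x'} .
```

The right-hand side is the familiar integrand $\sqrt{1 + (dy/dx)^2}\,dx$ of the arc length with $x$ as independent variable.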



The method employed and the constraint equations found can be immediately
applied to the more general case of a greater number of variables
$x, y, z, \dots$. For instance, the integral

\[ J = \int_{t_1}^{t_2} F\bigl(x^{(\mu)}, y^{(\mu)}, z^{(\mu)}\bigr)\,dt \]

taken along a segment 1 2 of a space curve: $x = \varphi(t)$, $y = \psi(t)$, $z = \chi(t)$,
possesses a value independent of the particular form of representation, i.e.,
of the choice of the parameter $t$, if

\[ \Xi_\nu + \mathrm{H}_\nu + \mathrm{Z}_\nu = \Pi_\nu + \mathrm{P}_\nu + \mathrm{T}_\nu = \varepsilon_{\nu,1}\,F \qquad (\nu = 1, 2, \dots n) , \tag*{(17), (18)} \]







wo die Ausdriicke Z U ,T U ebenso nach z gebildet sind wie E v , 17„; H u , P u 
nach x und y. 

Die Untersuchung ist hier mit grosserer Vollstandigkeit gefiihrt worden, 
als es fur die unmittelbare Anwendung auf das Problem der Variationsrech- 
nung notwendig gewesen ware; die Frage ist als eine Aufgabe von selbstandi- 
15 gem Interesse aufgefasst | worden, die bisher eine ausreichende Beantwortung 
noch nicht gefunden zu haben scheint. 

Zuerst, soweit ich in Erfahrung bringen konnte, stellt sie Lagrange in 
seinen „Legons sur le Calcul des Fonctions 11 in dem der Variationsrechnung 
gewidmeten Abschnitte und zwar in der hier durch (20) gegebenen Form, und 
er beweist hier die Notwendigkeit der einen Gleichung: 



Px' + Qy' = 0 (15) 

(in meiner Bezeichnung), ohne sich iiber die hinreichenden Bedingungen zu 
aussern. Nach ihm gelangt Poisson in seinem „Memoire sur le Calcul des Va- 
riations“ auf ahnlichem Wege zu demselben Ergebnis, halt aber, wie er aus- 
drticklich erklart, diese eine Bedingung zugleich fiir hinreichend, ein Irrtum, 
den auch Todhunter (History on the Progress of the Calculus of Variations) 
in seiner Besprechung der Poissonschen Schrift wiederholt hat. 

Dass (15) in Wirklichkeit fiir keinen Wert von n eine hinreichende Bedin- 
gung ist, geht schon daraus hervor, dass 



F = C = const 



vermoge P = 0, Q — 0 diese Bedingung befriedigt, wahrend doch 

C = c# = c, 

x' 

von keiner der Formen (la) oder (20) ist und auch das Integral 



*2 

J Cdt = C(t 2 -h) 

tl 

von der Wahl des Parameters abhangig ist. Eine beliebige Function F von 
der in Frage stehenden Eigenschaft muss daher nach Hinzufugung einer be- 
liebigen, nicht verschwindenden Constanten diese Eigenschaft verlieren, wah- 
rend Px' + Qy' = 0 unverandert bestehen bleibt. Thatsachlich driickt die 
Gleichung (15) nur die Bedingung dafiir aus, dass F(x^\ y^), wenn fiir y 
irgend eine Function von x, oder allgemeiner, wenn fiir x, y irgend welche 
Functionen von id eingesetzt werden, „integrabel“ sei, dass namlich: 



16 




1 1 



t2 

, yM'j dt = J F 0 (tf, dt 

tl 



j(d, 



' t 2 

- ti 







where the expressions $\mathrm{Z}_\nu, \mathrm{T}_\nu$ are formed with respect to $z$, just as
$\Xi_\nu, \Pi_\nu$; $\mathrm{H}_\nu, \mathrm{P}_\nu$ are with respect to $x$ and $y$.

This case has been investigated more thoroughly than necessary for the 
immediate application to the problem of the calculus of variations; the ques- 
tion has been considered a problem of independent interest and does not seem 
to have been completely answered yet. 

As far as I was able to learn, it was Lagrange who posed this question 
for the first time, namely in the form it takes here for (20), in the part of 
his 1806 that is devoted to the calculus of variations. There, he proves the 
necessity of the one equation 



Px' + Qy' = 0 (15) 

(in my terminology), without saying anything about the sufficient conditions. 
Subsequently, Poisson obtains the same result in a similar fashion in his 1823 
but, as he expressly states, takes this one condition to be also sufficient, a 
mistake that Todhunter (1861) duplicates in his review of Poisson’s treatise. 

That (15) is, in fact, not a sufficient condition for any value of $n$ can
already be seen when we consider that

\[ F = C = \text{const} \]

satisfies this condition by dint of $P = 0$, $Q = 0$, whereas

\[ C = \frac{C}{\vartheta'}\,\vartheta' = \frac{C}{x'}\,x' \]

is neither of the form (1a) nor (20), and also the integral

\[ \int_{t_1}^{t_2} C\,dt = C\,(t_2 - t_1) \]

depends on the choice of the parameter. Adding an arbitrary nonvanishing
constant to an arbitrary function $F$ having the property under consideration must
therefore deprive the function of this property, whereas $Px' + Qy' = 0$ continues
to hold. In fact, the equation (15) only expresses the condition that
$F(x^{(\mu)}, y^{(\mu)})$ is “integrable” whenever some arbitrary function of $x$ is
substituted for $y$ or, more generally, some arbitrary functions of $\vartheta$ for $x, y$, namely
that




^1 




1 1 




dt = 




■ t 2 
- ti 







nur von den Endwerten: 

nicht aber von dem ganzen Verlauf, von der Form der Function 0 abhangt. 
Die bekannte Eulersche „Integrabilitatsbedingung“ 



<4 O rp 

li — 0 



ist namlich Equivalent der Eigenschaft, dass der Ausdruck: 
r „ dF OF dF / s 

s » F =di M 

seinerseits „integrabel“ ist fiir eine willkiirliche Function = O' t. 

Es ist aber nach (7a) 

n 

S#F = (Px' + Qy') t + dY, {p„(x't)^ ~ V + Q^y'r)^ ~ 1] } 

m = i 

also die Integrabilitatsbedingung, wie behauptet: 

Px' + Qy' = 0 . 



(15) 



So ist z. B.: 

F = xy" — yx" + xyx' = — (xi/ ~ V x ') + X V X ' 

nach t integrabel, sobald fiir y irgencl eine Function von x eingesetzt wird, das 

dy 

Integral aber hangt dann immer noch von x' oder von y' = -p- x ' ab, andert 

ax 

sich also auch mit der Darstellung der Curve, iiber welche die Integration 
erstreckt wird. 

In der That ist hier auch: 



dF d dF d 2 dF 

dx dt dx' dt 2 dx" 



y" + yx' - j t ( x v) - = -xy' 



dF d dF d 2 dF 

dy dt dy' + dt 2 dy" 



—x" + xx' + -r-^x = xx 1 , 
dt 2 



also 



Px' + Qy' = 0 , 



( 15 ) 



wahrend cloch: 



dF , dF , 
dP jX + ~dy" V 



— yx ' + xy' 



nicht identisch verschwindet, (17)2 also nicht erfiillt ist. 







only depends on the end values 

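but not on the entire course, on the form of the function $\vartheta$.
For Euler’s well-known “integrability condition”

\[ \sum_{\mu=0}^{n} (-1)^\mu D^\mu\,\frac{\partial F}{\partial\vartheta^{(\mu)}} = 0 \]

is equivalent to the property that the expression

\[ \delta_\vartheta F = \frac{\partial F}{\partial\vartheta}\,\delta\vartheta + \frac{\partial F}{\partial\vartheta'}\,\delta\vartheta' + \dots + \frac{\partial F}{\partial\vartheta^{(n)}}\,\delta\vartheta^{(n)} \]

is, in turn, “integrable” for an arbitrary function $\delta\vartheta = \vartheta'\tau$.

But, by (7a),

\[ \delta_\vartheta F = (Px' + Qy')\,\tau + D \sum_{\mu=1}^{n} \left\{ P_\mu\,(x'\tau)^{(\mu-1)} + Q_\mu\,(y'\tau)^{(\mu-1)} \right\} , \]

and hence the integrability condition, as asserted:

\[ Px' + Qy' = 0 . \tag{15} \]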

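Thus, for instance,

\[ F = xy'' - yx'' + xyx' = \frac{d}{dt}\,(xy' - yx') + xyx' \]

is integrable with respect to $t$, as soon as an arbitrary function of $x$ is
substituted for $y$. But then the integral still depends on $x'$ or on $y' = \frac{dy}{dx}\,x'$,
and hence varies along with the representation of the curve along which the
integration is carried out.

In fact, we also have

\[ P = \frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial x'} + \frac{d^{2}}{dt^{2}}\frac{\partial F}{\partial x''} = y'' + yx' - \frac{d}{dt}(xy) - \frac{d^{2}}{dt^{2}}\,y = -xy' , \]

\[ Q = \frac{\partial F}{\partial y} - \frac{d}{dt}\frac{\partial F}{\partial y'} + \frac{d^{2}}{dt^{2}}\frac{\partial F}{\partial y''} = -x'' + xx' + \frac{d^{2}}{dt^{2}}\,x = xx' , \]

thus

\[ Px' + Qy' = 0 , \tag{15} \]

while

\[ \frac{\partial F}{\partial x''}\,x' + \frac{\partial F}{\partial y''}\,y' = -yx' + xy' \]

does not vanish identically, and hence (17)$_2$ is not satisfied.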
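
The dependence on the representation can be made explicit by a short calculation. The following is only a sketch: the function $f$, with $y = f(x)$, is introduced here for illustration and does not occur in the original, and $x_1 = x(t_1)$, $x_2 = x(t_2)$ are likewise assumed names for the end values.

```latex
% Illustration only: f is a hypothetical differentiable function, y = f(x),
% so that y' = f'(x) x'. The example integrand then splits into a total
% derivative and a term that depends on the curve alone.
\begin{align*}
F &= \frac{d}{dt}\bigl(x y' - y x'\bigr) + x y x'
   = \frac{d}{dt}\Bigl[\bigl(x f'(x) - f(x)\bigr)\,x'\Bigr] + x f(x)\,x' , \\
\int_{t_1}^{t_2} F\,dt
  &= \Bigl[\bigl(x f'(x) - f(x)\bigr)\,x'\Bigr]_{t_1}^{t_2}
   + \int_{x_1}^{x_2} x f(x)\,dx .
\end{align*}
% The last integral is a function of the curve only, but the bracket
% contains x' at the end points and changes with the parameter t.
```

For $f(x) = x^2$, for example, the bracket becomes $x^2 x'$, so that rescaling the parameter rescales the boundary term although the curve is unchanged.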







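Das Verfahren, dessen wir uns zur Ableitung der Bedingungsgleichungen
für $F$ bedienten, lässt sich leicht auf verwandte Aufgaben übertragen.

Eine Function

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) \qquad (\mu = 0, 1, \ldots n,\ \vartheta = \vartheta(t)) \]

soll dieselbe Eigenschaft besitzen wie das Integral von $F$, d. h. sie soll von
der besonderen Darstellung der Curve $x = \varphi(\vartheta)$, $y = \psi(\vartheta)$, also von
$\vartheta', \vartheta'', \ldots \vartheta^{(n)}$ unabhängig sein und nach Wahl von $\varphi$, $\psi$ nur noch von $\vartheta$ abhängen,
so dass (vergl. (1a)):

\[ \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) . \tag{21} \]

Dann muss $\Phi$ wie vorher $J'(\vartheta)$ der Gleichung (4) genügen:

\[ \delta_{\vartheta}\Phi = \sum_{\mu=0}^{n} \left\{ \Phi_{\mu}\,(x'\tau)^{(\mu)} + \Psi_{\mu}\,(y'\tau)^{(\mu)} \right\} = D\Phi \cdot \tau , \tag{22} \]

| wenn jetzt:

\[ \Phi_{\mu} = \frac{\partial \Phi}{\partial x^{(\mu)}} , \qquad \Psi_{\mu} = \frac{\partial \Phi}{\partial y^{(\mu)}} \]

und wieder $\delta\vartheta = \vartheta'\tau$ gesetzt wird, wo $\tau$ eine willkürliche Function von $t$ ist.

Nun gelten für $\delta_{\vartheta}\Phi$ dieselben Umformungen wie für $\delta_{\vartheta}F$, so dass man
schliesslich mit Anwendung derselben Bezeichnungen die verlangten Bedingungen
in einer der Formen schreiben kann: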



— v + Hv — 0 



(v = l,2,...n) . 



n v + P v — o 

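Wenn man in (21) wieder $t = \varphi(\vartheta) = x$ einsetzt, so ergiebt sich:

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\Bigl(x, 1, 0, \ldots 0;\ y, \frac{dy}{dx}, \ldots \frac{d^{n}y}{dx^{n}}\Bigr) = \Phi\Bigl(x, y, \frac{d^{\mu}y}{dx^{\mu}}\Bigr) , \tag{24} \]

wo wie in (20) wieder

\[ \frac{d^{\nu}y}{dx^{\nu}} = \frac{S_{\nu}\bigl(x^{(\mu)}, y^{(\mu)}\bigr)}{x'^{\,2\nu-1}} \tag{23} \]

gesetzt werden kann.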







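The procedure we used to derive the constraint equations for $F$ can easily
be applied to related problems.

Suppose that a function

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) \qquad (\mu = 0, 1, \ldots n,\ \vartheta = \vartheta(t)) \]

has the same property as the integral of $F$, i. e., it is independent of the
particular representation of the curve $x = \varphi(\vartheta)$, $y = \psi(\vartheta)$, and hence of
$\vartheta', \vartheta'', \ldots \vartheta^{(n)}$, and only depends on $\vartheta$, after $\varphi$, $\psi$ have been chosen, so that
(cf. (1a)):

\[ \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) . \tag{21} \]

Then $\Phi$, just like $J'(\vartheta)$ above, must satisfy the equation (4):

\[ \delta_{\vartheta}\Phi = \sum_{\mu=0}^{n} \left\{ \Phi_{\mu}\,(x'\tau)^{(\mu)} + \Psi_{\mu}\,(y'\tau)^{(\mu)} \right\} = D\Phi \cdot \tau , \tag{22} \]

provided that we now set

\[ \Phi_{\mu} = \frac{\partial \Phi}{\partial x^{(\mu)}} , \qquad \Psi_{\mu} = \frac{\partial \Phi}{\partial y^{(\mu)}} \]

and $\delta\vartheta = \vartheta'\tau$ again, where $\tau$ is any function of $t$.

In this case, the same transformations hold for $\delta_{\vartheta}\Phi$ as for $\delta_{\vartheta}F$ so that,
using the same notation, we can finally write the required conditions in
one of the following forms: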



— v + H v — 0 

77„ + P v = 0 



(u = 1,2, ...n) . 



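If we again substitute $t = \varphi(\vartheta) = x$ into (21), we obtain

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\Bigl(x, 1, 0, \ldots 0;\ y, \frac{dy}{dx}, \ldots \frac{d^{n}y}{dx^{n}}\Bigr) = \Phi\Bigl(x, y, \frac{d^{\mu}y}{dx^{\mu}}\Bigr) , \tag{24} \]

where, as in (20),

\[ \frac{d^{\nu}y}{dx^{\nu}} = \frac{S_{\nu}\bigl(x^{(\mu)}, y^{(\mu)}\bigr)}{x'^{\,2\nu-1}} . \tag{23} \]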







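Ebenso kann man auch die Bogenlänge einführen:

\[ t = s = \int \sqrt{x'^{2} + y'^{2}}\;dt , \]

also:

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\Bigl(\frac{d^{\mu}x}{ds^{\mu}}, \frac{d^{\mu}y}{ds^{\mu}}\Bigr) \]

setzen. Es ist aber:

\[ \frac{dx}{ds} = \cos\alpha_1 , \qquad \frac{dy}{ds} = \sin\alpha_1 , \]

wo $\alpha_1$ den von Tangente und $x$-Richtung gebildeten Winkel bezeichnet, und
es können die

\[ \frac{d^{\mu}x}{ds^{\mu}} = \frac{d^{\mu-1}\cos\alpha_1}{ds^{\mu-1}} , \qquad \frac{d^{\mu}y}{ds^{\mu}} = \frac{d^{\mu-1}\sin\alpha_1}{ds^{\mu-1}} \]

ausgedrückt werden als ganze rationale Functionen von:

\[ \cos\alpha_1 , \quad \sin\alpha_1 , \quad \alpha_2 = \frac{d\alpha_1}{ds} , \quad \ldots \quad \alpha_{\mu} = \frac{d^{\mu-1}\alpha_1}{ds^{\mu-1}} , \]

so dass schliesslich:

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi_1(x, y, \alpha_1, \alpha_2, \ldots \alpha_n) = \Phi(x, y, \alpha_{\mu}) . \tag{25} \]

Hier ist aber:

\[ \alpha_1 = \operatorname{arctg}\frac{y'}{x'} , \qquad \alpha_2 = \frac{x'y'' - y'x''}{(x'^{2} + y'^{2})^{3/2}} \quad \text{(die Krümmung)} \]

und allgemein:

\[ \alpha_{\mu} = \frac{d^{\mu-1}\alpha_1}{ds^{\mu-1}} = \frac{d^{\mu-2}\alpha_2}{ds^{\mu-2}} = \frac{T_{\mu}\bigl(x^{(\nu)}, y^{(\nu)}\bigr)}{(x'^{2} + y'^{2})^{\frac{3\mu-3}{2}}} \qquad (\mu > 1) , \]

wo $T_{\mu}\bigl(x^{(\nu)}, y^{(\nu)}\bigr)$ eine ganze rationale Function von $x', x'', \ldots x^{(\mu)}$; $y', y'', \ldots
y^{(\mu)}$ bezeichnet.

Diese Ausdrücke der $\alpha_{\mu}$ besitzen dieselbe Eigenschaft (21), unabhängig
von der Differentiationsvariablen zu sein, wie $\Phi$ selbst. Nach (25) lässt sich
also jede Function $\Phi$ von der betrachteten Eigenschaft, jede „Osculations-Invariante“,
wie sie aus nachher anzugebendem Grunde genannt werden
möge, durch die $n + 2$ besonderen $x, y, \alpha_1, \ldots \alpha_n$ ausdrücken, ebenso wie
nach (24) durch die $x, y, \frac{d^{\mu}y}{dx^{\mu}}$. Doch hat die Darstellungsweise (25) den wesentlichen
Vorzug vor der anderen, dass die $\alpha_{\mu}$ immer endlich bleiben für alle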







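In the same way, we can introduce the arc length

\[ t = s = \int \sqrt{x'^{2} + y'^{2}}\;dt , \]

and hence set

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\Bigl(\frac{d^{\mu}x}{ds^{\mu}}, \frac{d^{\mu}y}{ds^{\mu}}\Bigr) . \]

But

\[ \frac{dx}{ds} = \cos\alpha_1 , \qquad \frac{dy}{ds} = \sin\alpha_1 , \]

where $\alpha_1$ denotes the angle between the tangent and the $x$ direction, and we
can express the

\[ \frac{d^{\mu}x}{ds^{\mu}} = \frac{d^{\mu-1}\cos\alpha_1}{ds^{\mu-1}} , \qquad \frac{d^{\mu}y}{ds^{\mu}} = \frac{d^{\mu-1}\sin\alpha_1}{ds^{\mu-1}} \]

as integral rational functions of

\[ \cos\alpha_1 , \quad \sin\alpha_1 , \quad \alpha_2 = \frac{d\alpha_1}{ds} , \quad \ldots \quad \alpha_{\mu} = \frac{d^{\mu-1}\alpha_1}{ds^{\mu-1}} , \]

so that finally

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi_1(x, y, \alpha_1, \alpha_2, \ldots \alpha_n) = \Phi(x, y, \alpha_{\mu}) . \tag{25} \]

But here,

\[ \alpha_1 = \operatorname{arctg}\frac{y'}{x'} , \qquad \alpha_2 = \frac{x'y'' - y'x''}{(x'^{2} + y'^{2})^{3/2}} \quad \text{(the curvature)} \]

and, generally,

\[ \alpha_{\mu} = \frac{d^{\mu-1}\alpha_1}{ds^{\mu-1}} = \frac{d^{\mu-2}\alpha_2}{ds^{\mu-2}} = \frac{T_{\mu}\bigl(x^{(\nu)}, y^{(\nu)}\bigr)}{(x'^{2} + y'^{2})^{\frac{3\mu-3}{2}}} \qquad (\mu > 1) , \]

where $T_{\mu}\bigl(x^{(\nu)}, y^{(\nu)}\bigr)$ denotes an integral rational function of $x', x'', \ldots x^{(\mu)}$; $y',
y'', \ldots y^{(\mu)}$.

These expressions for the $\alpha_{\mu}$ possess the same property (21) of being
independent of the variable of differentiation as $\Phi$ itself. By (25), we can
therefore express every function $\Phi$ with the property under consideration,
every “osculation invariant”, as we will call it for reasons to be specified later,
in terms of the $n + 2$ special $x, y, \alpha_1, \ldots \alpha_n$ and, likewise, by (24), in terms
of the $x, y, \frac{d^{\mu}y}{dx^{\mu}}$. But (25) has the essential advantage over the other mode of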







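„regulären“ Stellen der Curven, d. h. solche, wo die $x^{(\mu)}, y^{(\mu)}$ endlich sind und
gleichzeitig $x'^{2} + y'^{2} > 0$, die $x'$ und $y'$ nicht beide verschwinden.

Auch für die Function $F$ von der früher untersuchten Beschaffenheit ergiebt
sich hieraus eine neue Darstellungsweise. Setzt man nämlich:

20 |
\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr)\,\sqrt{x'^{2} + y'^{2}} , \]

also für $x = \varphi(\vartheta)$, $y = \psi(\vartheta)$, je nachdem man nach $t$ oder $\vartheta$ differentiiert,
einerseits

\[ F\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) \cdot \sqrt{\varphi'^{2}(\vartheta) + \psi'^{2}(\vartheta)}\;\vartheta' , \]

andererseits aber

\[ F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\sqrt{\varphi'^{2}(\vartheta) + \psi'^{2}(\vartheta)} , \]

so wird nach:

\[ F\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\vartheta' , \tag{1a} \]

jetzt:

\[ \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) , \]

$\Phi$ ist also eine „Osculations-Invariante“ und daher nach (25)

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi(x, y, \alpha_{\mu})\,\sqrt{x'^{2} + y'^{2}} \tag{26} \]

\[ F\,dt = \Phi(x, y, \alpha_{\mu})\,ds . \]

Der für Functionen der hier betrachteten Art eingeführte Ausdruck gründet
sich auf ihre Bedeutung für die Berührung oder „Osculation“ zweier Curven,
über welche sich hier einige Bemerkungen anschliessen mögen, die in den
folgenden Untersuchungen vielfache Anwendung finden werden.

Wenn zwei Curven $x = \varphi(u)$, $y = \psi(u)$; $x = \bar\varphi(v)$, $y = \bar\psi(v)$ in den
Punkten $u = u_0$, $v = v_0$, in denen sie sich regulär verhalten, wo also auch
$\varphi'^{2}(u) + \psi'^{2}(u)$ und $\bar\varphi'^{2}(v) + \bar\psi'^{2}(v) > 0$ sind, gemeinsame Werte besitzen für
$x, y, \alpha_1, \alpha_2, \ldots \alpha_m$ und damit für alle Osculations-Invarianten $\Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr)$ bis
zur $m$ten Ordnung ($\mu = 0, 1, \ldots m$), so sagt man, sie berühren einander von
$m$ter Ordnung in dem Punkte:

\[ x_0 = \varphi(u_0) = \bar\varphi(v_0) , \qquad y_0 = \psi(u_0) = \bar\psi(v_0) . \]

21 | Genauer wäre dieses Verhältnis auszudrücken: sie berühren einander „von
mindestens $m$ter Ordnung“, noch genauer „von einer Ordnung $> m - 1$“,
im Sinne der strengeren von Herrn Weierstrass in seinen Vorlesungen gegebenen
Definition, der zufolge die wahre Ordnungszahl der Berührung auch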





representation that the $\alpha_{\mu}$ always remain finite for all “regular” positions of
the curves, i. e., those where the $x^{(\mu)}, y^{(\mu)}$ are finite and, at the same time,
$x'^{2} + y'^{2} > 0$, the $x'$ and $y'$ do not both vanish.
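
That the curvature $\alpha_2$ does not depend on the variable of differentiation can be checked directly. The sketch below assumes a change of parameter $t = t(\vartheta)$ with $\dot t = dt/d\vartheta > 0$; the dot notation for $d/d\vartheta$ is introduced here only for this check and is not the notation of the text.

```latex
% Assumption: t = t(\vartheta) with \dot t = dt/d\vartheta > 0.
% Primes denote d/dt, dots denote d/d\vartheta.
\begin{align*}
\dot x &= x'\,\dot t , & \ddot x &= x''\,\dot t^{2} + x'\,\ddot t , \\
\dot y &= y'\,\dot t , & \ddot y &= y''\,\dot t^{2} + y'\,\ddot t ,
\end{align*}
\begin{align*}
\dot x\,\ddot y - \dot y\,\ddot x &= \dot t^{3}\,(x'y'' - y'x'') , \\
(\dot x^{2} + \dot y^{2})^{3/2} &= \dot t^{3}\,(x'^{2} + y'^{2})^{3/2} .
\end{align*}
% The terms containing \ddot t cancel, and the factor \dot t^{3} drops out
% of the quotient, so \alpha_2 has the same value in both parameters.
```

The same cancellation shows why the regularity condition $x'^{2} + y'^{2} > 0$ is needed: it keeps the denominator away from zero.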

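This also yields a new mode of representation for the function $F$ of the
constitution investigated earlier. For if we set

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr)\,\sqrt{x'^{2} + y'^{2}} , \]

and hence, for $x = \varphi(\vartheta)$, $y = \psi(\vartheta)$, depending on whether we differentiate
with respect to $t$ or $\vartheta$, on the one hand

\[ F\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr)\,\sqrt{\varphi'^{2}(\vartheta) + \psi'^{2}(\vartheta)}\;\vartheta' , \]

but on the other hand

\[ F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\sqrt{\varphi'^{2}(\vartheta) + \psi'^{2}(\vartheta)} , \]

then, by

\[ F\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = F\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr)\,\vartheta' , \tag{1a} \]

now

\[ \Phi\bigl(D^{\mu}\varphi(\vartheta), D^{\mu}\psi(\vartheta)\bigr) = \Phi\bigl(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta)\bigr) , \]

and hence $\Phi$ is an “osculation invariant”, and therefore, by (25),

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi(x, y, \alpha_{\mu})\,\sqrt{x'^{2} + y'^{2}} \tag{26} \]

\[ F\,dt = \Phi(x, y, \alpha_{\mu})\,ds . \]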
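
Two familiar integrands may illustrate the representation (26). They are chosen here purely as examples and are not taken from the original; only the relation $ds = \sqrt{x'^{2} + y'^{2}}\,dt$ and the expressions for $\alpha_1$, $\alpha_2$ given above are used.

```latex
% Illustration of (26), F dt = \Phi ds, with ds = \sqrt{x'^2 + y'^2}\,dt.
\begin{align*}
\Phi = 1 &: &
F &= \sqrt{x'^{2} + y'^{2}} , &
\int F\,dt &= \int ds \quad \text{(arc length)} , \\
\Phi = \alpha_2 &: &
F &= \frac{x'y'' - y'x''}{x'^{2} + y'^{2}} , &
\int F\,dt &= \int \alpha_2\,ds = \int d\alpha_1 .
\end{align*}
% In the second line F is the derivative of arctg(y'/x') with respect to t,
% so the integral is the total change of the tangent angle \alpha_1.
```

In both cases the factor $\sqrt{x'^{2} + y'^{2}}$ carries the whole dependence on the parameter, while $\Phi$ itself is unchanged by a reparametrization.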

The expression introduced here for functions of the kind under consideration
is based on their significance for the contact or “osculation” of two
curves, about which a few comments are in order here which will be invoked
many times in the investigations to follow.

If two curves $x = \varphi(u)$, $y = \psi(u)$; $x = \bar\varphi(v)$, $y = \bar\psi(v)$ have common values
for $x, y, \alpha_1, \alpha_2, \ldots \alpha_m$, and hence for all osculation invariants $\Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr)$ up
to the $m$th order ($\mu = 0, 1, \ldots m$), at the points $u = u_0$, $v = v_0$ where they
are regular, and hence where also $\varphi'^{2}(u) + \psi'^{2}(u)$ and $\bar\varphi'^{2}(v) + \bar\psi'^{2}(v) > 0$,
then they are said to have contact of the $m$th order at the point

\[ x_0 = \varphi(u_0) = \bar\varphi(v_0) , \qquad y_0 = \psi(u_0) = \bar\psi(v_0) . \]

It would be more precise to express this relationship as follows: they have
contact “of at least $m$th order”, or even more precisely, “of an order $> m - 1$”,
in the sense of the more rigorous definition given by Mr. Weierstrass in his







gebrochen sein kann. Doch will ich mich hier immer der abgekürzten Ausdrucksweise
bedienen.

Damit der betrachtete Umstand eintritt, müssen sich, wie sich streng beweisen
liesse, für geeignete unabhängige Variable die Ableitungen der Coordinaten
bis zur $m$ten Ordnung zur Übereinstimmung bringen lassen, d. h. es
müssen sich $u$ und $v$ so als Functionen eines beliebigen Parameters $t$ darstellen
lassen, dass für:

\[ t = t_0 , \qquad D^{\mu}u = \frac{d^{\mu}u}{dt^{\mu}} = u_0^{(\mu)} , \qquad D^{\mu}v = \frac{d^{\mu}v}{dt^{\mu}} = v_0^{(\mu)} \]

die Beziehungen bestehen:

\[ D^{\mu}\varphi(u) = D^{\mu}\bar\varphi(v) , \qquad D^{\mu}\psi(u) = D^{\mu}\bar\psi(v) \qquad (\mu = 0, 1, 2, \ldots m) \tag{27} \]

oder, was nach (2) auf dasselbe hinauskommt, dass die Gleichungen:

\[ R_{\mu}\bigl(\varphi^{(\nu)}(u_0), u_0^{(\nu)}\bigr) = R_{\mu}\bigl(\bar\varphi^{(\nu)}(v_0), v_0^{(\nu)}\bigr) \]
\[ R_{\mu}\bigl(\psi^{(\nu)}(u_0), u_0^{(\nu)}\bigr) = R_{\mu}\bigl(\bar\psi^{(\nu)}(v_0), v_0^{(\nu)}\bigr) \]
\[ (\mu = 0, 1, 2, \ldots m) \]

durch geeignete Werte der $u_0^{(\mu)}, v^{(\mu)}$ ($\mu = 0, 1, \ldots m$) befriedigt werden können.

Ist dies der Fall, so ist auch in der That an der betrachteten Stelle
nach (21):

\[ \Phi\bigl(\varphi^{(\mu)}(u), \psi^{(\mu)}(u)\bigr) = \Phi\bigl(D^{\mu}\varphi(u), D^{\mu}\psi(u)\bigr) \tag{28} \]
\[ = \Phi\bigl(D^{\mu}\bar\varphi(v), D^{\mu}\bar\psi(v)\bigr) = \Phi\bigl(\bar\varphi^{(\mu)}(v), \bar\psi^{(\mu)}(v)\bigr) . \]

Hierbei ist jedoch für $m \ge 1$, d. h. wenn eine wirkliche Berührung stattfinden
soll, immer noch $u'_0, v'_0 \ne 0$ vorauszusetzen, weil wegen

22 |
\[ \varphi'(u)\,u' = \bar\varphi'(v)\,v' , \qquad \psi'(u)\,u' = \bar\psi'(v)\,v' \]

auch

\[ \sqrt{\varphi'^{2}(u) + \psi'^{2}(u)}\;u' = \pm \sqrt{\bar\varphi'^{2}(v) + \bar\psi'^{2}(v)}\;v' \]

\[ \frac{u'}{v'} = \pm \frac{\sqrt{\bar\varphi'^{2}(v) + \bar\psi'^{2}(v)}}{\sqrt{\varphi'^{2}(u) + \psi'^{2}(u)}} \]

sein und daher







lectures according to which the true order number of the contact may also 
be rational. But I shall always use the shorthand form here. 
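
A small example may make the definition of contact concrete. The two curves below are hypothetical and not taken from the original: a parabola $a$ and the unit circle $\bar a$ touching it at the origin, compared at $u_0 = v_0 = 0$.

```latex
% Hypothetical example curves (not from the source):
%   a:      x = u,       y = u^2/2
%   \bar a: x = \sin v,  y = 1 - \cos v
% At u_0 = v_0 = 0 both curves pass through (0, 0), and
\begin{align*}
a &: & x' &= 1, & y' &= 0, & x'' &= 0, & y'' &= 1 , \\
\bar a &: & x' &= \cos 0 = 1, & y' &= \sin 0 = 0, & x'' &= -\sin 0 = 0, & y'' &= \cos 0 = 1 ,
\end{align*}
% so that for both curves
\[
\alpha_1 = \operatorname{arctg}\frac{y'}{x'} = 0 , \qquad
\alpha_2 = \frac{x'y'' - y'x''}{(x'^{2} + y'^{2})^{3/2}} = 1 .
\]
```

The curves thus share $x, y, \alpha_1, \alpha_2$ at the origin and have contact of (at least) the second order there in the shorthand sense; in this example the derivatives up to the second order already agree for the parameters $u$ and $v$ as given, so no change of parameter is needed.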

For the situation under consideration to arise, we must, as could be proved
rigorously, be able to bring into agreement the derivatives of the coordinates
up to the $m$th order for a suitable independent variable, i. e., it must be possible
to represent $u$ and $v$ as functions of an arbitrary parameter $t$ so that for

\[ t = t_0 , \qquad D^{\mu}u = \frac{d^{\mu}u}{dt^{\mu}} = u_0^{(\mu)} , \qquad D^{\mu}v = \frac{d^{\mu}v}{dt^{\mu}} = v_0^{(\mu)} , \]

the relations

\[ D^{\mu}\varphi(u) = D^{\mu}\bar\varphi(v) , \qquad D^{\mu}\psi(u) = D^{\mu}\bar\psi(v) \qquad (\mu = 0, 1, 2, \ldots m) \tag{27} \]

hold, or, what, by (2), amounts to the same, that the equations

\[ R_{\mu}\bigl(\varphi^{(\nu)}(u_0), u_0^{(\nu)}\bigr) = R_{\mu}\bigl(\bar\varphi^{(\nu)}(v_0), v_0^{(\nu)}\bigr) \]
\[ R_{\mu}\bigl(\psi^{(\nu)}(u_0), u_0^{(\nu)}\bigr) = R_{\mu}\bigl(\bar\psi^{(\nu)}(v_0), v_0^{(\nu)}\bigr) \]
\[ (\mu = 0, 1, 2, \ldots m) \]

can be satisfied by suitable values of the $u_0^{(\mu)}, v_0^{(\mu)}$ ($\mu = 0, 1, \ldots m$). 2

If this is the case, then, at the position under consideration, we indeed
have, by (21),

\[ \Phi\bigl(\varphi^{(\mu)}(u), \psi^{(\mu)}(u)\bigr) = \Phi\bigl(D^{\mu}\varphi(u), D^{\mu}\psi(u)\bigr) \tag{28} \]
\[ = \Phi\bigl(D^{\mu}\bar\varphi(v), D^{\mu}\bar\psi(v)\bigr) = \Phi\bigl(\bar\varphi^{(\mu)}(v), \bar\psi^{(\mu)}(v)\bigr) . \]

In this case, however, we still have to assume for $m \ge 1$, i. e., if a genuine
contact is to occur, that $u'_0, v'_0 \ne 0$, for on account of

\[ \varphi'(u)\,u' = \bar\varphi'(v)\,v' , \qquad \psi'(u)\,u' = \bar\psi'(v)\,v' \]

we also must have

\[ \sqrt{\varphi'^{2}(u) + \psi'^{2}(u)}\;u' = \pm \sqrt{\bar\varphi'^{2}(v) + \bar\psi'^{2}(v)}\;v' , \]

and hence

\[ \frac{u'}{v'} = \pm \frac{\sqrt{\bar\varphi'^{2}(v) + \bar\psi'^{2}(v)}}{\sqrt{\varphi'^{2}(u) + \psi'^{2}(u)}} \]

2 [Zermelo erroneously writes “$v^{(\mu)}$” instead of “$v_0^{(\mu)}$”.]







einen bestimmten von $0$ und $\infty$ verschiedenen Wert annehmen muss. Auch
ist dabei zu unterscheiden zwischen „gleichgerichteter Berührung“, bei welcher
$u'$ und $v'$, mithin auch $\varphi'(u)$ und $\bar\varphi'(v)$, $\psi'(u)$ und $\bar\psi'(v)$ gleiches Vorzeichen
haben, von der „entgegengesetzten“ Berührung, wo alle diese Grössen-Paare
entgegengesetzte Vorzeichen besitzen; im ersten Falle werden die Coordinaten
der beiden Curven in der Nähe des Berührungspunktes in gleichem,
im anderen Falle in entgegengesetztem Sinne sich ändern, wenn jede Curve
im Sinne der wachsenden Variablen $u$ oder $v$ beschrieben gedacht wird. Bei
den späteren Anwendungen wird es immer so eingerichtet werden, dass nur
„gleichgerichtete“ Berührungen in Betracht kommen.

Da die Variable $t$ in der zuerst gegebenen Definition der Berührung selbst
keine Rolle spielt, so müssen auch die Bedingungen (27) bestehen bleiben,
wenn man darin diese Variable durch eine beliebige Function derselben ersetzt.

Sei etwa:

\[ D_{\vartheta}^{\mu}\varphi(u) = D_{\vartheta}^{\mu}\bar\varphi(v) , \qquad D_{\vartheta}^{\mu}\psi(u) = D_{\vartheta}^{\mu}\bar\psi(v) \tag{27} \]
\[ \Bigl(\mu = 0, 1, 2, \ldots m;\ D_{\vartheta}^{\mu} = \frac{d^{\mu}}{d\vartheta^{\mu}}\Bigr) , \]

so wird mit Hilfe von (2) in der That auch:

\[ D_{t}^{\mu}\varphi(u) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\varphi(u), \vartheta^{(\nu)}\bigr) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\bar\varphi(v), \vartheta^{(\nu)}\bigr) = D_{t}^{\mu}\bar\varphi(v) \]
\[ D_{t}^{\mu}\psi(u) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\psi(u), \vartheta^{(\nu)}\bigr) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\bar\psi(v), \vartheta^{(\nu)}\bigr) = D_{t}^{\mu}\bar\psi(v) \]
\[ (\mu = 0, 1, \ldots m) , \]

wo die $\vartheta^{(\mu)} = \frac{d^{\mu}\vartheta}{dt^{\mu}}$ an der betrachteten Stelle bis auf $\vartheta' \ne 0$ willkürlich
gewählt sein können. Nun kann man die Variable $t$ immer so wählen, dass $u$
eine beliebige Function von $t$ wird, also beliebig vorgeschriebene Werte der
$u^{(\mu)} = \frac{d^{\mu}u}{dt^{\mu}}$ im Berührungspunkte annimmt; dann aber sind die anderen
Grössen $v^{(\mu)} = \frac{d^{\mu}v}{dt^{\mu}}$ durch die Gleichungen (27) vollständig bestimmt. Diese
lassen sich nämlich nach (2) in der Form schreiben:

\[ R_{\mu}\bigl(\bar\varphi^{(\nu)}(v), v^{(\nu)}\bigr) = D^{\mu}\varphi(u) \]
\[ R_{\mu}\bigl(\bar\psi^{(\nu)}(v), v^{(\nu)}\bigr) = D^{\mu}\psi(u) \qquad (\mu = 0, 1, \ldots m) \tag{29} \]

und gestatten eine successive eindeutige Auflösung nach $v', v'', \ldots v^{(m)}$ durch
Benutzung der oberen oder der unteren Gleichungen, je nachdem $\bar\varphi'(v)$ oder
$\bar\psi'(v)$ von Null verschieden ist.







must take a particular value different from $0$ and $\infty$. Also we have to distinguish
here the “contact of equal direction”, where $u'$ and $v'$, and hence also
$\varphi'(u)$ and $\bar\varphi'(v)$, $\psi'(u)$ and $\bar\psi'(v)$ have the same sign, from the contact “of
opposite direction”, where all these pairs of quantities have opposite signs; in
the first case, the coordinates of the two curves vary in the same sense near
the point of contact, and in the other case, they vary in the opposite sense,
assuming that we describe each curve in the sense of the increasing variables $u$
or $v$. In later applications, we will always set up matters so that we only have
to consider contacts “of equal direction”.

Since the variable $t$ is itself irrelevant in the first definition of contact, the
conditions (27) must continue to hold when this variable is replaced by an
arbitrary function of it.

For instance, let

\[ D_{\vartheta}^{\mu}\varphi(u) = D_{\vartheta}^{\mu}\bar\varphi(v) , \qquad D_{\vartheta}^{\mu}\psi(u) = D_{\vartheta}^{\mu}\bar\psi(v) \tag{27} \]
\[ \Bigl(\mu = 0, 1, 2, \ldots m;\ D_{\vartheta}^{\mu} = \frac{d^{\mu}}{d\vartheta^{\mu}}\Bigr) , \]

then, by means of (2), we indeed also have

\[ D_{t}^{\mu}\varphi(u) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\varphi(u), \vartheta^{(\nu)}\bigr) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\bar\varphi(v), \vartheta^{(\nu)}\bigr) = D_{t}^{\mu}\bar\varphi(v) \]
\[ D_{t}^{\mu}\psi(u) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\psi(u), \vartheta^{(\nu)}\bigr) = R_{\mu}\bigl(D_{\vartheta}^{\nu}\bar\psi(v), \vartheta^{(\nu)}\bigr) = D_{t}^{\mu}\bar\psi(v) \]
\[ (\mu = 0, 1, \ldots m) , \]

where the $\vartheta^{(\mu)} = \frac{d^{\mu}\vartheta}{dt^{\mu}}$ can be chosen arbitrarily at the position under consideration
except that $\vartheta' \ne 0$. Now we can always choose the variable $t$ so that $u$
becomes an arbitrary function of $t$, and hence assumes arbitrarily prescribed
values of the $u^{(\mu)} = \frac{d^{\mu}u}{dt^{\mu}}$ at the point of contact; but the other quantities
$v^{(\mu)} = \frac{d^{\mu}v}{dt^{\mu}}$ are then completely determined by the equations (27). For, by (2),
they can be written in the form

\[ R_{\mu}\bigl(\bar\varphi^{(\nu)}(v), v^{(\nu)}\bigr) = D^{\mu}\varphi(u) \]
\[ R_{\mu}\bigl(\bar\psi^{(\nu)}(v), v^{(\nu)}\bigr) = D^{\mu}\psi(u) \qquad (\mu = 0, 1, \ldots m) \tag{29} \]

and allow for a successive unique solution for $v', v'', \ldots v^{(m)}$ by use of the
upper or lower equations, depending on whether $\bar\varphi'(v)$ or $\bar\psi'(v)$ is different
from zero.







So kann man auch $t = u$, $u^{(\mu)} = \varepsilon_{\mu,1}$ ($\mu = 1, 2, \ldots m$) vorschreiben und
erhält dann die Bedingungen der Osculation in der Form:

\[ \varphi^{(\mu)}(u) = D^{\mu}\bar\varphi(v) = R_{\mu}\bigl(\bar\varphi^{(\nu)}(v), v^{(\nu)}\bigr) \]
\[ \psi^{(\mu)}(u) = D^{\mu}\bar\psi(v) = R_{\mu}\bigl(\bar\psi^{(\nu)}(v), v^{(\nu)}\bigr) \tag{30} \]
\[ (\mu = 0, 1, 2, \ldots m) , \]

wo die $u, v^{(\mu)}$ für den Berührungspunkt zu nehmen sind und wo im Falle
„gleichgerichteter“ Berührung immer $v' > 0$ angenommen werden muss.



Zweiter Abschnitt.

Definition und erste notwendige Bedingungen
des Minimums.

Indem wir ein Maximum oder Minimum eines Integrales

\[ \int_{t_1}^{t_2} F\,dt \]

suchen, können wir durch die Substitution $F \,\|\, -F$ den Fall des Maximums
auf den des Minimums zurückführen, brauchen uns also ohne Beschränkung
der Allgemeinheit nur mit dem letzteren zu beschäftigen. Unsere Aufgabe ist
demnach die folgende:

Ist

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = F\bigl(x, x', \ldots x^{(n)};\ y, y', \ldots y^{(n)}\bigr) \]

eine vorgeschriebene analytische Function ihrer sämtlichen Argumente, die
im ganzen betrachteten Bereiche den Charakter einer ganzen Function besitzt
und den im ersten Abschnitte entwickelten Integrabilitätsbedingungen
genügt, so suchen wir unter der Gesamtheit $A$ aller Curven

\[ x = \varphi(t) , \qquad y = \psi(t) , \]

die gewissen vorgeschriebenen Bedingungen genügen, eine solche besondere
Curve $a$, welche einen kleineren Wert des auf der Curve zwischen bestimmten
Grenzen erstreckten Integrales











Thus we can also set $t = u$, $u^{(\mu)} = \varepsilon_{\mu,1}$ ($\mu = 1, 2, \ldots m$), thereby obtaining
the conditions of the osculation in the form

\[ \varphi^{(\mu)}(u) = D^{\mu}\bar\varphi(v) = R_{\mu}\bigl(\bar\varphi^{(\nu)}(v), v^{(\nu)}\bigr) \]
\[ \psi^{(\mu)}(u) = D^{\mu}\bar\psi(v) = R_{\mu}\bigl(\bar\psi^{(\nu)}(v), v^{(\nu)}\bigr) \tag{30} \]
\[ (\mu = 0, 1, 2, \ldots m) , \]

where we must take the $u, v^{(\mu)}$ for the point of contact and always assume
$v' > 0$ in the case of contact “of equal direction”.



Second section.

Definition and first necessary conditions
of the minimum.



When seeking to determine the maximum or minimum of an integral

\[ \int_{t_1}^{t_2} F\,dt \]

we may reduce the case of the maximum to that of the minimum by the
substitution $F \,\|\, -F$, and hence only need to consider the latter without loss
of generality. Our task is therefore as follows:

Suppose that

\[ F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = F\bigl(x, x', \ldots x^{(n)};\ y, y', \ldots y^{(n)}\bigr) \]

is a prescribed analytic function of all of its arguments that in the whole
domain under consideration possesses the character of an entire function and
satisfies the integrability conditions set out in the first section. Then among
the totality $A$ of all curves

\[ x = \varphi(t) , \qquad y = \psi(t) \]

satisfying certain prescribed conditions we seek to determine a special curve $a$
furnishing a value for the integral

\[ J = \int_{t_1}^{t_2} F\bigl(x^{(\mu)}, y^{(\mu)}\bigr)\,dt \qquad \Bigl(x^{(\mu)} = \frac{d^{\mu}x}{dt^{\mu}} ,\ y^{(\mu)} = \frac{d^{\mu}y}{dt^{\mu}}\Bigr) \]







liefert als alle benachbarten Curven $\bar a$ derselben Gesamtheit $A$, 25 | so dass immer:

\[ \Delta J = J(\bar a) - J(a) > 0 \tag{31} \]

wird.

Die unsere Gesamtheit $A$ definierenden Bedingungen mögen sich für die
nachstehenden Untersuchungen auf die folgenden beschränken.

1. Grenzbedingungen. Alle Curven $A$ sollen an beiden Grenzen der Integration
$t = t_1$ und $t = t_2$ in gegebenen Punkten 1 und 2 zwei gegebene
analytische Curven $d_1$ und $d_2$

\[ x = k_1(t) , \quad y = l_1(t) ; \qquad x = k_2(t) , \quad y = l_2(t) \]

von $(n-1)$ter Ordnung berühren, so dass nach (30) die Gleichungen:

D'VO? i) = = k[^ , D^(tfi) = = l { i ] 

2 ) = k^ (t' 2 ) = k^ , D^{d 2 ) = l M (t' 2 ) = l :<**> 

(M = 0, l,2,...n-l) 

durch passende Werte der = tf^Vi) und ^ = ^\t' 2 ) (i?i = t u 
d 2 = t 2 ) an den Stellen t = t\ und t = t 2 befriedigt werden konnen; oder, was 
dasselbe ist, alle diese Curven sollen an den Grenzen vorgeschriebene Werte
der Osculations-Invarianten $x, y, \alpha_1, \ldots \alpha_{n-1}$ besitzen.

2. Stetigkeitsbedingungen.

$\alpha$) Damit die über die Curven $A$ erstreckten Integrale immer einen
bestimmten Sinn haben, wollen wir vorläufig voraussetzen, dass sich
das ganze Intervall $t_1 \ldots t_2$ jedesmal in eine endliche Anzahl von Teilen
zerlegen lässt, in deren jedem

\[ x^{(\mu)} = \varphi^{(\mu)}(t) , \qquad y^{(\mu)} = \psi^{(\mu)}(t) \qquad (\mu = 0, 1, \ldots n) \tag{33} \]

eindeutige und stetige Functionen von $t$ sind. Allerdings ist die Convergenz
der Integrale auch unter allgemeineren Voraussetzungen möglich,
worauf aber hier zur Vereinfachung der Untersuchung keine Rücksicht
genommen werden soll.

26 | $\beta$) Ausserdem sollen die Functionen $\varphi(t)$ und $\psi(t)$ der Curven $A$
mit ihren $r$ ersten Ableitungen im ganzen Integrations-Intervall
ausnahmslos stetig verlaufen (34), wo $r \ge 0$ vorläufig unbestimmt bleibt,
und für $r \ge 1$ der Bedingung genügen:

\[ \varphi'^{2}(t) + \psi'^{2}(t) \ge \gamma^{2} > 0 , \tag{35} \]

sodass „singuläre Punkte“ ($\varphi'(t) = 0$, $\psi'(t) = 0$) ausgeschlossen sind.
Es wird sich nämlich (vergl. Satz IV) zeigen, dass ohne eine solche
Voraussetzung ein Minimum überhaupt unmöglich wäre.







taken along the curve between certain limits that is smaller than that of all
neighboring curves $\bar a$ of the same totality $A$ so that we always have

\[ \Delta J = J(\bar a) - J(a) > 0 . \tag{31} \]

In the subsequent investigations, we will restrict the conditions defining
our totality $A$ as follows:

1. Limit conditions. All curves $A$ shall have contact of $(n-1)$th order with
two given analytic curves $d_1$ and $d_2$

\[ x = k_1(t) , \quad y = l_1(t) ; \qquad x = k_2(t) , \quad y = l_2(t) \]

at both limits of the integration $t = t_1$ and $t = t_2$ at given points 1 and 2 so
that, by (30), the equations

(tfr) = k[^ (f'i) = k^ , £>^(0i ) = = l [ m) 

£>M02) = k^\t' 2 ) = k M , D^(d 2 ) = l^\t' 2 ) = 

(m = 0, 1, 2, . . . n — 1) 

can be satisfied by appropriate values of = d^ftf) and = d^ 2 \ 1 2 ) 
(t?i = ti, d 2 = t 2 ) at the positions t — and t = t' 2 , or, what amounts 
to the same, all these curves shall possess prescribed values of the osculation
invariants $x, y, \alpha_1, \ldots \alpha_{n-1}$ at the limits.

2. Continuity conditions.

$\alpha$) In order to ensure that the integrals taken along the curves $A$
always have a definite meaning, we shall, for the present, assume that
the entire interval $t_1 \ldots t_2$ is always capable of decomposition into a
finite number of parts in each of which

\[ x^{(\mu)} = \varphi^{(\mu)}(t) , \qquad y^{(\mu)} = \psi^{(\mu)}(t) \qquad (\mu = 0, 1, \ldots n) \tag{33} \]

are single-valued and continuous functions of $t$. But in order to simplify
our present investigation we shall not consider the possibility of
the convergence of the integrals also under more general assumptions.

$\beta$) Furthermore, the functions $\varphi(t)$ and $\psi(t)$ of the curves $A$, together
with their first $r$ derivatives, shall be continuous on the
entire interval of integration without exception (34), where $r \ge 0$ remains
indefinite for the present, and, for $r \ge 1$, satisfy the condition

\[ \varphi'^{2}(t) + \psi'^{2}(t) \ge \gamma^{2} > 0 , \tag{35} \]

so that “singular points” ($\varphi'(t) = 0$, $\psi'(t) = 0$) are excluded. For we
shall see (cf. Theorem IV) that a minimum would be entirely impossible
without an assumption of this kind.







Streng genommen brauchten die (y ^ r) nicht selbst immer 

stetig zu bleiben, sondern nur die „Osculations-Invarianten“ & (a ;W,j/W) bis 
zur rten Ordnung, also namentlich die x,y,a i, . . . a r , die nach (25) alle iib- 
rigen bestimmen. Denn durch Veranderung der Differentiations-Variablen, 
oder, was dasselbe ist, der „Darstellung“ der Curve fiir einzelne Teile des 
Intervalls, wodurch etwa die in iiberge- 

hen, wiirde die Stetigkeit der x^\y^\ wenn sie einmal bestande, beliebig 
aufgehoben und wieder eingefiihrt werden konnen. Es geniigte also, die Ste- 
tigkeit der fiir die einzelnen Teile der Curve vorauszusetzen, 

wenn nur ausserdem an den Übergangsstellen immer Berührungen $r$ter Ordnung
angenommen werden. Dieser Fall aber wird sich durch eine geeignete
Darstellung der Curve immer vermeiden lassen, und so soll denn auch den
folgenden Untersuchungen die Bedingung (34) in voller Strenge zu Grunde
gelegt werden.

Alle diesen Bedingungen (32)-(35) genügenden Curven $A$ wollen wir als
„erlaubte Curven“, den Übergang von einer zur anderen als eine „erlaubte
Variation“ bezeichnen. Anderweitige Beschränkungen der Curven $A$ sollen
hier nicht betrachtet, überall soll „freie Variation“ vorausgesetzt werden.

Einer sorgfältigen Untersuchung bedarf jetzt noch der Begriff der „benachbarten“
Curven ($\bar a$), dessen Auffassung für das ganze Problem von entscheidender
Bedeutung ist.

Herr Prof. Weierstrass betrachtet als zu $a$ „benachbarte“ Curven $\bar a$ alle
diejenigen, welche ganz innerhalb eines gewissen, durch eine Begrenzung $C$
eingeschlossenen Flächenstreifens ver- 27 | laufen, wenn $C$ das Curvenstück $a$ vollständig
umgiebt und ihm nirgend unendlich nahe kommt. Ist $\delta$ der kürzeste
Abstand zwischen $C$ und $a$, so liegen im Innern des Streifens alle Punkte,
die von $a$ Entfernungen $< \delta$ besitzen, und umgekehrt wird eine zweite Begrenzungscurve
$C'$ derselben Art, für welche $\delta$ den grössten Abstand von $a$
darstellt, nur solche Punkte einschliessen, die $a$ näher kommen als auf die
Entfernung $\delta$. Dieses zweite, durch $C'$ eingeschlossene Gebiet wird aber für
die Frage nach dem Minimum dieselbe Rolle spielen wie das grössere $C$, so
dass man auch alle die Curven $A$ als „benachbarte“ betrachten kann, deren




Fig. 1. 



sämtliche Punkte von $a$ Abstände $< \delta$ besitzen, oder, analytisch ausgedrückt,
unter den $A$ alle solchen Curven

\[ x = \bar\varphi(\lambda) , \qquad y = \bar\psi(\lambda) , \]







Strictly speaking, the $x^{(\mu)}, y^{(\mu)}$ ($\mu \le r$) would not always have to be continuous
themselves, but only the “osculation invariants” $\Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr)$ up to
the $r$th order, and hence in particular $x, y, \alpha_1, \ldots \alpha_r$, which, by (25), determine
all the others. For by changing the variable of differentiation, or, what
amounts to the same, the “representation” of the curve for individual parts of
the interval, whereby, say, the are transformed into 

’(i9), we would be able to remove and reintroduce at will the continuity of 
the x^\y ^ once it obtains. It is therefore sufficient to assume the continu- 
ity of the for the individual parts of the curve, provided only 

that we also assume contacts of the $r$th order at the corners. But a suitable
representation of the curve always helps avoid this scenario. Therefore, the
rigorous application of the condition (34) shall underlie the investigations to
follow.

All curves $A$ satisfying the conditions (32)-(35) will be called “admissible
curves”, and the transition from one to the other an “admissible variation”.
We will not consider any further restrictions on the curves $A$ and always
assume “free variation”.

The concept of “neighboring” curve ($\bar a$) now still requires thorough investigation
on account of its crucial significance for the entire problem.

Prof. Weierstrass considers all those curves $\bar a$ as curves “neighboring” $a$
that run entirely within a certain strip of the plane enclosed by a boundary
$C$, if $C$ completely surrounds the curve segment $a$ without getting infinitely
close to it at any point. If $\delta$ is the shortest distance between $C$ and $a$, then
all those points lie in the interior of the strip whose distance from $a$ is $< \delta$,
and, conversely, a second boundary curve $C'$ of the same kind for which $\delta$
is the greatest distance from $a$ encloses only points that get closer to $a$ than
the distance $\delta$. But this second region, which is enclosed by $C'$, will play the
same role in the question of the minimum as the greater one, $C$, so that we
can consider all those curves $A$ “neighboring” curves each of whose points lie

Fig. 1.

at a distance $< \delta$ from $a$, or, expressed in analytic terms, among the $A$ all
such curves

\[ x = \bar\varphi(\lambda) , \qquad y = \bar\psi(\lambda) , \]







für welche zu jedem Werte $\lambda$ zwischen $\lambda_1$ und $\lambda_2$ stets ein $t = x$ zwischen $t_1$
und $t_2$ so bestimmt werden kann, dass

\[ \sqrt{\bigl(\bar\varphi(\lambda) - \varphi(x)\bigr)^{2} + \bigl(\bar\psi(\lambda) - \psi(x)\bigr)^{2}} < \delta , \]

also auch

\[ |\bar\varphi(\lambda) - \varphi(x)| < g , \qquad |\bar\psi(\lambda) - \psi(x)| < g \]

sicher für $g \ge \delta$. Bestehen aber umgekehrt diese letzten Ungleichheiten
für $g \le \frac{\delta}{\sqrt{2}}$, so folgt daraus wieder die vorhergehende.

Ein Minimum wird daher für $a$ dann und nur dann stattfinden, wenn alle
diejenigen erlaubten Curven ein grösseres Intervall als $a$ liefern, welche für
irgend ein constantes positives $g$ den beiden Ungleichheiten genügen.

Die so entwickelte Definition ist aber für unseren allgemeineren Fall nicht
ausreichend, da sie die Existenz eines Mini- 28 | mums allzusehr beschränken, ja
vielleicht überhaupt unmöglich machen würde. Wir müssen vielmehr voraussetzen,
dass wenigstens in den Fällen $n > 1$ nicht nur die Punkte der Curven $\bar a$
solchen von $a$ hinreichend nahe kommen, sondern auch die Tangentenrichtungen,
die Krümmungen u. s. w., allgemein die Osculations-Invarianten bis zu
einer gewissen Ordnung $m$. Es sollen also als „benachbart“ angesehen werden
alle solchen Curven $x = \bar\varphi(\lambda)$, $y = \bar\psi(\lambda)$, für welche zu jedem $\lambda$ des ganzen
Intervalls immer ein $x$ zwischen $t_1$ und $t_2$ so bestimmt werden kann, dass
eine Anzahl Ungleichheitsbedingungen befriedigt werden von der Form:

\[ \Bigl| \Phi\bigl(\bar\varphi^{(\mu)}(\lambda), \bar\psi^{(\mu)}(\lambda)\bigr) - \Phi\bigl(\varphi^{(\mu)}(x), \psi^{(\mu)}(x)\bigr) \Bigr| < \delta , \tag{36} \]

wo

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(x, x', \ldots x^{(m)};\ y, y', \ldots y^{(m)}\bigr) \]

eine Function von der Eigenschaft (21) und $\delta$ eine von $\lambda$ unabhängige Grösse
ist.

Diese Bedingungen lassen sich wieder auf die zwischen den Coordinaten
selbst und ihren Ableitungen bestehenden Ungleichheiten zurückführen:

\[ \bigl| D^{\mu}\bar\varphi(\lambda) - \varphi^{(\mu)}(x) \bigr| < g_{\mu} , \qquad \bigl| D^{\mu}\bar\psi(\lambda) - \psi^{(\mu)}(x) \bigr| < g_{\mu} \tag{37} \]
\[ (\mu = 0, 1, 2, \ldots m) , \]

wo für ein jedes $\lambda$ ausser $x$ noch die Grössen $\lambda' > 0, \lambda'', \ldots \lambda^{(m)}$, die Werte
der ersten $m$ Ableitungen einer Function $\lambda(t) = \lambda(t; \lambda)$ für $t = x$, beliebig
angenommen werden können. Dabei brauchen die zu verschiedenen $\lambda$ gehörigen
$x, \lambda^{(\mu)}$ von vornherein in keinen Beziehungen zu einander zu stehen,
während die positiven Grössen $g, g_1, \ldots g_m$ von $\lambda$ unabhängige bestimmte
Werte haben müssen.

Die Willkürlichkeit der Parameterdarstellung kommt für die Curven $\bar\varphi, \bar\psi$
hier nicht in Betracht, da jede Veränderung der Darstellung offenbar nur die







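are considered “neighboring” curves for which it is always possible to determine
for every value $\lambda$ between $\lambda_1$ and $\lambda_2$ a $t = x$ between $t_1$ and $t_2$ so
that

\[ \sqrt{\bigl(\bar\varphi(\lambda) - \varphi(x)\bigr)^{2} + \bigl(\bar\psi(\lambda) - \psi(x)\bigr)^{2}} < \delta , \]

and hence also

\[ |\bar\varphi(\lambda) - \varphi(x)| < g , \qquad |\bar\psi(\lambda) - \psi(x)| < g \]

certainly for $g \ge \delta$. But if, conversely, these latter inequalities hold for
$g \le \frac{\delta}{\sqrt{2}}$, then, in turn, the former follows from this.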
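
Both directions of this comparison are elementary. In the sketch below, $\Delta x$ and $\Delta y$ are abbreviations introduced here for the two coordinate differences; they are not notation from the original.

```latex
% Abbreviations (introduced for this check only):
%   \Delta x = \bar\varphi(\lambda) - \varphi(x), \quad
%   \Delta y = \bar\psi(\lambda) - \psi(x).
\begin{align*}
\sqrt{\Delta x^{2} + \Delta y^{2}} < \delta
  &\;\Longrightarrow\;
  |\Delta x|,\ |\Delta y| \le \sqrt{\Delta x^{2} + \Delta y^{2}} < \delta \le g
  && \text{for } g \ge \delta , \\
|\Delta x| < g ,\ |\Delta y| < g
  &\;\Longrightarrow\;
  \sqrt{\Delta x^{2} + \Delta y^{2}} < \sqrt{g^{2} + g^{2}} = g\sqrt{2} \le \delta
  && \text{for } g \le \frac{\delta}{\sqrt{2}} .
\end{align*}
```

The circular and the square neighborhoods of a point therefore contain one another up to the factor $\sqrt{2}$, which is all that the definition of the minimum requires.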

Hence $a$ has a minimum if and only if all those admissible curves furnish
a greater interval than $a$ that satisfy both inequalities for some positive
constant $g$.

But the definition thus developed is not sufficient for our more general
case as it would greatly restrict the existence of a minimum, and perhaps even
make it impossible. Rather, we have to assume that at least in the cases $n > 1$
not only the points of the curves $\bar a$ get sufficiently close to those of $a$ but also
the directions of the tangents, the curvatures etc., generally, the osculation
invariants up to a certain order $m$. Hence all those curves $x = \bar\varphi(\lambda)$, $y = \bar\psi(\lambda)$
are to be considered “neighboring” curves for which it is always possible to
determine for every $\lambda$ of the entire interval an $x$ between $t_1$ and $t_2$ so that
several inequality conditions of the form

\[ \Bigl| \Phi\bigl(\bar\varphi^{(\mu)}(\lambda), \bar\psi^{(\mu)}(\lambda)\bigr) - \Phi\bigl(\varphi^{(\mu)}(x), \psi^{(\mu)}(x)\bigr) \Bigr| < \delta , \tag{36} \]

are satisfied, where

\[ \Phi\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi\bigl(x, x', \ldots x^{(m)};\ y, y', \ldots y^{(m)}\bigr) \]

is a function with the property (21) and $\delta$ a quantity independent of $\lambda$.

These conditions can again be reduced to the inequalities obtaining between
the coordinates themselves and their derivatives:

\[ \bigl| D^{\mu}\bar\varphi(\lambda) - \varphi^{(\mu)}(x) \bigr| < g_{\mu} , \qquad \bigl| D^{\mu}\bar\psi(\lambda) - \psi^{(\mu)}(x) \bigr| < g_{\mu} \tag{37} \]
\[ (\mu = 0, 1, 2, \ldots m) , \]

where for each $\lambda$ we may choose arbitrarily, besides $x$, also the quantities
$\lambda' > 0, \lambda'', \ldots \lambda^{(m)}$, the values of the first $m$ derivatives of a function
$\lambda(t) = \lambda(t; \lambda)$ for $t = x$. In this case, the $x, \lambda^{(\mu)}$ belonging to different $\lambda$ do not need to
stand in any relation to one another from the outset, whereas the positive
quantities $g, g_1, \ldots g_m$ must have values independent of $\lambda$.

The arbitrariness of the parametric representation is out of the question 
here as far as the curves Tp , are concerned since every change in the rep- 






Zermelo 1894 



A', A", . . . A (m ) andern wiirde. Dagegen muss immer eine bestimmte Darstel- 
lung x = y = il >(t) von a zu Grunde gelegt werden, ihre Veranderung 
miisste unter Umstanden eine Veranderung der g p zur Folge haben, wenn die 
Beziehungen (37) fortbestehen sollen. 

29 | Wie leicht zu zeigen ist, lassen sich die immer so klein angeben, dass 

fiir alle den Bedingungen (37) geniigenden Curven a auch Bedingungen der 
Form (36) in beliebiger Anzahl und fiir beliebig kleine 8 befriedigt werden 
konnen, und umgekehrt giebt es immer eine Anzahl von Ungleichheiten (36) 
mit so kleinen Werten der <5, dass auch die (37) gelten miissen fiir passen- 
de x, \ <IJ> und beliebig klein vorgeschriebene g Doch wird dabei vorausge- 
setzt, dass jedes ip^ft)) auf dem ganzen Curvenstiicke a iiberall 

endlich bleibt. 

Von a soil also behauptet werden, es lief ere ein Minimum in einer „Nach- 
barschaft mter Ordnung“, falls sich fiir irgend eine Darstellung: x = <p(t), 
y = if(t) von a positive Grossen g, g\, . . . g m so angeben lassen, dass alle 
„erlaubten“ Curven a 

x = ^(A), y = V>( A) , 

die fiir alle in Betracht kommenden A und fiir jedesmal passend bestimm- 
te x, A^ den Bedingungen (37) geniigen, Integrate liefern 

J(a) > J(a). 

Die „Ordnung der Nachbarschaft“ m wird spater auf die Falle m = 
n — 1 und m = n beschrankt werden, welche allein fiir unsere Untersuchung 
Interesse bieten. 

Man erkennt, dass die hier gegebene Definition der „benachbarten Curven“ 
sich ebenso wie die Weierstrass’sche nur auf ihr Verhalten in jedem einzelnen 
Punkte bezieht, nicht aber auf ihren allgemeinen Verlauf oder eine specielle 
Darstelhmgsform. 

Um die Analogie noch deutlicher hervorzuheben, kann man die Gesamt- 
heit aller Eigenschaften, die eine Curve in einem bestimmten Punkte mit 
alien sie von m ter Ordnung beriihrenden Curven gemeinsam hat, also die Ge- 
samtheit der Osculations-Invarianten <!> (x^\ y bis zur to ten Ordnung, in 
einen einzigen Begriff, den eines „Curvenelementes mter Ordnung “, zusam- 
menfassen, der sich fiir to = 0 auf den eines einfachen Punktes reduciert und 
somit als dessen Erweiterung aufgefasst werden kann. Bestimmt wird ein sol- 

dy d m y 

ches „Element“ nach (24) und (25) durch die Werte der x, y, — , . . . — — — oder 
” \ ) k ) ,y, dx’ dx m 

30 der x, y, ai, . . . a m , also durch | m + 2 unabhangige Variable, durch die x^\ 

y^ (p = 0, 1, . . .to) selbst aber nur insofern, als dabei von alien nur durch 
die willkiirliche Wahl der Variablen t entstehenden Verschiedenheiten abgese- 
hen werden muss. Es miissen namlich zwei Elemente ($) ; (^)) und 

(D^tpfd), D^%f{ r d)) m fiir beliebige Werte der d' , . . . als identisch betrach- 

tet werden, da sie nach (21) in Bezug auf samtliche Osculations-Invarianten 
bis zur Ordnung to iibereinstimmen. Somit haben zwei Curven x = <p(t), 







resentation would obviously only change the A', A", . . . A^ m ). By contrast, a 
particular representation $x = \varphi(t)$, $y = \psi(t)$ of $a$ must always be
taken as a basis, and changing it might necessitate a change of the $g_\mu$
if the relations (37) are to continue to hold.

As is readily shown, it is always possible to choose the $g_\mu$ so small
that all curves $\bar a$ satisfying the conditions (37) also satisfy an arbitrary
number of conditions of the form (36) for arbitrarily small $\delta$.
Conversely, there is always a number of inequalities (36) with values of
the $\delta$ so small that the conditions (37), too, must hold for appropriate
$\kappa$, $\lambda^{(\mu)}$ and arbitrarily small prescribed $g_\mu$. It is assumed
here, however, that every $\Phi(\varphi^{(\mu)}(t), \psi^{(\mu)}(t))$ remains finite
everywhere on the entire curve segment $a$.

What we wish to assert of $a$ is that it furnishes a minimum in a
“neighborhood of $m$th order” if, for some representation $x = \varphi(t)$,
$y = \psi(t)$ of $a$, positive quantities $g, g_1, \ldots g_m$ can be specified
so that all “admissible” curves $\bar a$

$$x = \bar\varphi(\lambda) , \qquad y = \bar\psi(\lambda) ,$$

that satisfy the conditions (37) for all relevant $\lambda$ and for suitably
determined $\kappa$, $\lambda^{(\mu)}$ in each case furnish integrals

$$J(\bar a) > J(a) .$$

Later, we will restrict the “order of the neighborhood” $m$ to the cases
$m = n - 1$ and $m = n$, which are the only ones of interest in our investigations.

We can see that, like Weierstrass’s, the definition of the “neighboring 
curves” given here only refers to their behavior at each individual point but 
not to their general course or a special form of representation. 

In order to bring out the analogy even more clearly we may capture the
totality of all properties shared by some curve at a certain point with all curves
having $m$th order contact with it, that is, the totality of osculation invariants
$\Phi(x^{(\mu)}, y^{(\mu)})$ up to the $m$th order, in a single concept, that of a
“curve element of $m$th order”, which is reduced to that of a simple point when
$m = 0$, and hence may be considered as its extension. According to (24) and (25),
such an “element” is determined by the values of the
$x, y, \frac{dy}{dx}, \ldots \frac{d^m y}{dx^m}$ or of the
$x, y, a_1, \ldots a_m$, and hence by $m + 2$ independent variables, by the
$x^{(\mu)}$, $y^{(\mu)}$ $(\mu = 0, 1, \ldots m)$ themselves but only insofar as
differences arising solely from the arbitrary choice of the variable $t$ have to be
left out of consideration. For two elements
$(\varphi^{(\mu)}(\vartheta), \psi^{(\mu)}(\vartheta))_m$ and
$(D^\mu\varphi(\vartheta), D^\mu\psi(\vartheta))_m$ for arbitrary values of the
$\vartheta', \vartheta'', \ldots$ must be considered identical, since, by (21), they
agree with respect to all osculation invariants up to order $m$. Hence, two







y = ip(t) und x = tp{t), y = ip{t) dann und nur dann ein „Element m ter 
Ordnung“ gemeinsam, wenn sie eine Beruhrung to ter Ordnung mit einander 
eingehen, wo nach (27) 

D^ip(u) = D^Tp{v) , = D^ip^v) (/x = 0, 1 , to) . 



Nun definiert (36) und in anderer Form (37) fiir jeden Wert von x ei- 
ne gewisse „Umgebung u des Elementes (x) , ip^ (x)) m und daher fiir 

variable x zwischen t\ und 1 2 eine „Nachbarschaft m ter Ordnung “ des Cur- 
venstrickes a. Solche „Umgebungen“ und solche „Nachbarschaften“ sind, wie 
leicht zu zeigen ist, continuierliche Bereiche in der to + 2 fachen Mannigfal- 
tigkeit der Curvenelemente, namlich Gesamtheiten von der Beschaffenheit, 
dass jedes ihrer im Inneren gelegenen Elemente stets in eine solche (nicht un- 
endlich kleine) Umgebung eingeschlossen werden kann, welche selbst wieder 
ganz der Gesamtheit angehort. Eine nach den Definitionen (36) und (37) zu a 
„benachbarte Curve 11 a ist also eine solche, deren samtliche Elemente einem 
solchen Nachbarschaftsbereiche von a angehoren, jedes namlich immer einer 
gewissen „Umgebung“ eines Elementes von a. 

„Begrenzt“ werden solche „Bereiche“ oder „Gebiete m ter Ordnung 11 nach 
(36) mit Hilfe von (24) und (25) durch Gebilde von hochstens to. ter Ordnung, 
namlich durch Gleichungen der Form: 



<P 




( d^y \ 



also durch Differentialgleichungen m ter Ordnung, deren Integrale in Gestalt 
von to fach unendlichen Curvenscharen eine geometrische Veranschaulichung 
31 der „Gebiete“ ermoglichen und der im Anfang | erwahnten „Begrenzungs- 
curve 11 C eines Flachenstreifens entsprechen. Doch soli auf diese Betrachtun- 
gen vorlaufig nicht weiter eingegangen, sondern zur eigentlichen Aufgabe der 
Variationsrechnung zuriickgekehrt werden. 

Es sei nur noch bemerkt, dass die hier gegebene Definition der Nachbar- 
schaft am nachsten kommt der Scheeffer’’ schen („Uber die Bedeutung der 
Begriffe: Maximum und Minimum in der Variationsrechnung 11 , Math. Ann. 
XXVI), in welche sie iibergeht, wenn in (37) X = x = x = t und m = n 
angenommen wird. Dann reducieren sich namlich diese Bedingungen auf die 
folgenden: 



ip^ ll \x) — ip^( x ) 



< 9n 



(m = 0,1 ,...n) , 



wo durch y = x ) und y = ip(x) die Curven a und a dargestellt werden. 







curves $x = \varphi(t)$, $y = \psi(t)$ and $x = \bar\varphi(t)$, $y = \bar\psi(t)$
have a common “element of the $m$th order” if and only if they make $m$th order
contact with one another, where, by (27),

$$D^\mu\varphi(u) = D^\mu\bar\varphi(v) , \qquad D^\mu\psi(u) = D^\mu\bar\psi(v) \qquad (\mu = 0, 1, \ldots m) .$$

Now (36), and also (37), albeit in a different form, defines for every value
of $\kappa$ a certain “vicinity” of the element
$(\varphi^{(\mu)}(\kappa), \psi^{(\mu)}(\kappa))_m$, and hence for variable $\kappa$
between $t_1$ and $t_2$ a “neighborhood of $m$th order” of the curve segment $a$.
Such “vicinities” and “neighborhoods” are, as is readily shown, continuous
domains in the $(m + 2)$-fold manifold of the curve elements, namely totalities
constituted so that each of their interior elements can always be enclosed in a
(not infinitely small) vicinity that, in turn, entirely belongs to the totality.
Hence, a curve $\bar a$ that, according to the definitions (36) and (37), is a curve
“neighboring” $a$ is one whose elements all belong to such a neighborhood domain
of $a$, that is, each of them always to a certain “vicinity” of an element of $a$.

Such “domains”, or “regions of the $m$th order”, are “delimited”, according
to (36) by means of (24) and (25), by structures of at most $m$th order, namely
by equations of the form

$$\Phi = \Phi_2\!\left(x, y, \frac{d^\mu y}{dx^\mu}\right) = 0 ,$$



hence by differential equations of mth order whose integrals in the form of 
m-fold infinite families of curves admit a geometric realization of the “regions” 
and correspond to the “boundary curve” C of a strip of the plane mentioned 
at the beginning. But we will now return to the real task of the calculus of 
variations without further elaborating on these considerations for the present. 

Let me add only that the definition of neighborhood given here comes
closest to Scheeffer's definition (Scheeffer 1886), into which it is transformed
if we set $\lambda = \kappa = x = t$ and $m = n$ in (37). For these conditions are
then reduced to the following ones:

$$|\bar\psi^{(\mu)}(x) - \psi^{(\mu)}(x)| < g_\mu \qquad (\mu = 0, 1, \ldots n) ,$$

where $y = \psi(x)$ and $y = \bar\psi(x)$ describe the curves $a$ and $\bar a$.







Untersu chung der ersten Variation. 



Zu den Curven a, flir welche im Fall, dass a ein Minimum liefert, J(a) > 
J(a) sein muss, gehoren alle Curven der Form: 

x = <p{t) =<p(t)+e£{t) 

V = i>(t) = i>(t) + er){t) 



mit den folgenden Eigenschaften: 

1. An den Grenzen soil 



= v M (h) = 0; ^ ) (t 2 )=0, jjW(t 2 ) = 0 (39) 

(/z = 0, 1, . . . n — 1) , 



wodurch auch flir ip(t),ip(t) die Bedingungen (32) befriedigt werden. 

2. Im ganzen Intervall t\ ^ t ^ t 2 sollen £(t) und 77(f), mithin auch 7p(t) 
und ip(t) den Stetigkeitsbedingungen (33) und (34) geniigen. 

Ausserdem soil |e| so klein gewahlt werden, dass flir t\ ^ t ^ t 2 ge- 
mass (35) 

3. + £?(t)) 2 + + er)'(t)) 2 > Y > 0 , 

32 | was, da schon a (35) geniigt: 

ip ,2 (t) +ip ,2 {t) > 7 2 > 0 

und immer £ 2 (£ ,2 (f) + ^ 0 ist, sicher erreicht werden kann durch: 

[}p ,2 {t) + ip ,2 (t)) 2 + 2e 

> 7 2 - 2\e\h = y > 0 , also |e| < , 

wenn | + ip 1 (t)r)' (t) | 5) h angenommen wird. 

Endlich soil, damit a auch zu den „benachbarten Curven 11 von a gehort, 
gleichfalls flir das ganze Intervall t\ ^ t ^ t 2 : 

4- , H (m) (0| < a m 

(^ = 0,1,2,... m) , 

wodurch (37) sicher befriedigt wird mit 

\ = x = t, A (Al) =e Pi i (/x = 1,2 ,...) . 



Dazu aber braucht nur: 

kl = 



< £e 



(/x = 0, 1 ,...m) 



angenommen zu werden, wenn 






< ft, 



[i ? 






< h u 



(ti ^ t ^ f 2 ; n = 0, 1, . . . m) 







Investigation of the first variation. 



Among the curves $\bar a$ for which it must be the case that $J(\bar a) > J(a)$
whenever $a$ furnishes a minimum are all curves of the form

$$x = \bar\varphi(t) = \varphi(t) + \varepsilon\xi(t)$$
$$y = \bar\psi(t) = \psi(t) + \varepsilon\eta(t)$$

with the following properties:

1. At the limits we shall have

$$\xi^{(\mu)}(t_1) = 0 , \ \eta^{(\mu)}(t_1) = 0 ; \qquad \xi^{(\mu)}(t_2) = 0 , \ \eta^{(\mu)}(t_2) = 0 \tag{39}$$

$$(\mu = 0, 1, \ldots n - 1) ,$$

whereby the conditions (32) are satisfied also for $\bar\varphi(t)$, $\bar\psi(t)$.

2. Both $\xi(t)$ and $\eta(t)$, and hence also $\bar\varphi(t)$ and $\bar\psi(t)$,
shall satisfy the continuity conditions (33) and (34) in the entire interval
$t_1 \le t \le t_2$.

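Furthermore, $|\varepsilon|$ shall be chosen so small that, for
$t_1 \le t \le t_2$, in accordance with (35),

3. $(\varphi'(t) + \varepsilon\xi'(t))^2 + (\psi'(t) + \varepsilon\eta'(t))^2 > \gamma' > 0 ,$

which, since $a$ already satisfies (35):

$$\varphi'^2(t) + \psi'^2(t) > \gamma^2 > 0$$

and always $\varepsilon^2(\xi'^2(t) + \eta'^2(t)) \ge 0$, can certainly be attained
by means of

$$\left(\varphi'^2(t) + \psi'^2(t)\right) + 2\varepsilon\left(\varphi'(t)\xi'(t) + \psi'(t)\eta'(t)\right) > \gamma^2 - 2|\varepsilon|h = \gamma' > 0 , \quad \text{hence} \quad |\varepsilon| < \frac{\gamma^2}{2h} ,$$

provided we assume that $|\varphi'(t)\xi'(t) + \psi'(t)\eta'(t)| \le h$.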
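
The bound on $|\varepsilon|$ in condition 3 can be checked numerically. The following is only a minimal sketch and not part of the original text: the unit circle as the curve $a$, the particular variation $\xi$, $\eta$, the grid, the value chosen for $\gamma^2$, and all function names are assumptions made for illustration.

```python
import math

# Hypothetical example curve a: the unit circle, so phi'^2 + psi'^2 = 1.
def dphi(t):
    return -math.sin(t)

def dpsi(t):
    return math.cos(t)

# Hypothetical variation; only the first derivatives xi', eta' are needed.
def dxi(t):
    return 3.0 * math.cos(3.0 * t)     # xi(t) = sin(3t)

def deta(t):
    return -2.0 * math.sin(2.0 * t)    # eta(t) = cos(2t)

def speed_sq(t, eps):
    """(phi' + eps*xi')^2 + (psi' + eps*eta')^2 for the varied curve."""
    return (dphi(t) + eps * dxi(t)) ** 2 + (dpsi(t) + eps * deta(t)) ** 2

grid = [2.0 * math.pi * k / 2000 for k in range(2001)]

gamma_sq = 0.99   # any value with phi'^2 + psi'^2 > gamma^2 > 0, cf. (35)

# h bounds |phi' xi' + psi' eta'| at the grid points (grid maximum, slightly inflated).
h = 1.01 * max(abs(dphi(t) * dxi(t) + dpsi(t) * deta(t)) for t in grid)

eps = 0.9 * gamma_sq / (2.0 * h)               # satisfies |eps| < gamma^2 / (2h)
gamma_prime = gamma_sq - 2.0 * abs(eps) * h    # the positive lower bound gamma'

# Smallest squared speed of the varied curve over the grid, for both signs of eps.
worst = min(min(speed_sq(t, eps), speed_sq(t, -eps)) for t in grid)
print(gamma_prime > 0, worst > gamma_prime)
```

The check is carried out only at grid points, so it illustrates the inequality rather than proving it for all $t$.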

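Finally, in order to ensure that $\bar a$ also belongs to the “neighboring curves”
of $a$, we also assume that, for the entire interval $t_1 \le t \le t_2$,

4. $|\varepsilon\xi^{(\mu)}(t)| < g_\mu , \quad |\varepsilon\eta^{(\mu)}(t)| < g_\mu \qquad (\mu = 0, 1, 2, \ldots m) ,$

whereby (37) is certainly satisfied when

$$\lambda = \kappa = t , \qquad \lambda^{(\mu)} = \varepsilon_{\mu 1} \qquad (\mu = 1, 2, \ldots) ,$$

that is, $\lambda' = 1$ with all higher $\lambda^{(\mu)}$ vanishing.

For this to be the case, it suffices to assume that

$$|\varepsilon| \le \frac{g_\mu}{h_\mu} \qquad (\mu = 0, 1, \ldots m) ,$$

when

$$|\xi^{(\mu)}(t)| < h_\mu , \qquad |\eta^{(\mu)}(t)| < h_\mu \qquad (t_1 \le t \le t_2;\ \mu = 0, 1, \ldots m)$$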







ihrer Endlichkeit wegen vorausgesetzt werden, wahrend gleichzeitig die g^ 
immer beliebig klein vorgeschrieben sein diirfen. 

Setzt man nun zur Abkiirzung: 

= F(x^\y^ =F 



F p ^ (t) + e £ ^ ( t ) , (t) + eg^ (t)^ 

= F ^ x ^ , y ^ = F e , 



F dt = J , 



F e dt= J £ , 



so kann man nach den iiber F gemachten Voraussetzungen fiir hinreichend 
kleine Werte von lei nach Potenzen von e entwickeln: 



F e — F + eSF + (e) 2 , J e — J + s8 J + (e) 2 



und fiir |<SJ| > 0 



J e ~ J=A £ J = e6J[ 1+^A j j 



wo der letzte Factor fiir hinreichend kleine |e| immer positiv, das Produkt 
also durch Wahl des Vorzeichens von e beliebig auch negativ gemacht werden 
kann, sodass hier ein Minimum fiir a sicherlich nicht eintritt. Also: 

Satz I. Wenn einem der Gesamtheit A angehorenden Curvenstiick a ein 
Minimum des Inteqrales J entsprechen soil, so muss vor alien Dingen die 
„erste Variation “ SJ fiir a immer verschwinden fiir beliebige den Bedingungen 
1. und 2. geniigende Functionen £(t),g(t). 

Es ist aber: 

SJ = SFdt= Y ( X u& ] + y^) dt , (41) 

l L »= ° 

wenn, ebenso wie im ersten Abschnitte, 

OF _ dF _ on oo 



dF -Y 

dxM ~ M ’ 



= Y„, <p M (t)=x^ ^Xf) = yM 



gesetzt wird. Nun lasst sich die schon in (7a), dort freilich fiir £ = x't, g = y'r , 
angegebene Umformung: 

n 

SF^Pf + Qg + DY, [Pu&~ 1] +QuV^~ 1] ) (42) 



p„ = Y (-1 y&xp+i , Qu=Y, 







are presupposed on account of their finiteness, while, at the same time, the
$g_\mu$ may always be prescribed arbitrarily small.

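If we now use the abbreviations

$$F\left(\varphi^{(\mu)}(t), \psi^{(\mu)}(t)\right) = F\left(x^{(\mu)}, y^{(\mu)}\right) = F ,$$

$$F\left(\varphi^{(\mu)}(t) + \varepsilon\xi^{(\mu)}(t),\ \psi^{(\mu)}(t) + \varepsilon\eta^{(\mu)}(t)\right) = F\left(x^{(\mu)} + \varepsilon\xi^{(\mu)},\ y^{(\mu)} + \varepsilon\eta^{(\mu)}\right) = F_\varepsilon ,$$

$$\int_{t_1}^{t_2} F\,dt = J , \qquad \int_{t_1}^{t_2} F_\varepsilon\,dt = J_\varepsilon ,$$

then, according to our assumptions about $F$, it is possible to expand in powers
of $\varepsilon$ for sufficiently small values of $|\varepsilon|$

$$F_\varepsilon = F + \varepsilon\,\delta F + (\varepsilon)_2 , \qquad J_\varepsilon = J + \varepsilon\,\delta J + (\varepsilon)_2 \tag{40}$$

and for $|\delta J| > 0$

$$J_\varepsilon - J = \Delta_\varepsilon J = \varepsilon\,\delta J\left(1 + \frac{(\varepsilon)_1}{\delta J}\right) ,$$

where the last factor is always positive for sufficiently small $|\varepsilon|$,
so that the product can also be made negative by an appropriate choice of the
sign of $\varepsilon$; hence $a$ certainly does not furnish a minimum in this case.
Therefore,

Theorem I. If a minimum of the integral $J$ is supposed to correspond
to a curve segment $a$ belonging to the totality $A$, then, first of all, the “first
variation” $\delta J$ for $a$ must always vanish for all functions $\xi(t)$, $\eta(t)$
satisfying conditions 1. and 2.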

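But

$$\delta J = \int_{t_1}^{t_2} \delta F\,dt = \int_{t_1}^{t_2} \sum_{\mu=0}^{n} \left(X_\mu\xi^{(\mu)} + Y_\mu\eta^{(\mu)}\right) dt \tag{41}$$

if, as in the first section, one sets

$$\frac{\partial F}{\partial x^{(\mu)}} = X_\mu , \qquad \frac{\partial F}{\partial y^{(\mu)}} = Y_\mu , \qquad \varphi^{(\mu)}(t) = x^{(\mu)} , \qquad \psi^{(\mu)}(t) = y^{(\mu)} .$$

Now it is only possible to carry out the transformation already stated in (7a),
albeit there for $\xi = x'\tau$, $\eta = y'\tau$,

$$\delta F = P\xi + Q\eta + D\sum_{\mu=1}^{n} \left(P_\mu\xi^{(\mu-1)} + Q_\mu\eta^{(\mu-1)}\right) \tag{42}$$

$$P_\mu = \sum_{i=0}^{n-\mu} (-1)^i D^i X_{\mu+i} , \qquad Q_\mu = \sum_{i=0}^{n-\mu} (-1)^i D^i Y_{\mu+i}$$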







34 | nur ausfiihren fiir solche Werte von t, fur welche die P ;J . Q fl , also die Ablei- 

tungen x^ n+1 \ . . . x^ 2n ^ ; y( n+1 \ . . . y < - 2n '> samtlich existieren, was fiir r < 2 n 
in (33) und (34) noch nicht enthalten ist; ja, wir mrissen sie als stetige Func- 
tionen von f in einem gewissen Intervalle voraussetzen, um den bekannten 
Schluss der Variationsrechnung machen zu konnen, dass P und Q iiberall 
verschwinden rniissen. Dann aber lasst sich unter den allgemeinsten Voraus- 
setzungen streng beweisen: 

Satz II. Existieren P und Q in irgend einem Teil-Intervall t’ . . . t" als 
stetige Functionen von t, so miissen sie fiir den Fall eines Minimums im 
ganzen Intervall iiberall verschwinden. 

Ist also t\ ^ t 1 < to < t" ^ t 2 , so wird behauptet, dass unter den 
gemachten Voraussetzungen P(fo) = 0 sein muss. 

Angenommen, es ware z.B. P(fo) > p > 0 positiv, so liesse sich der 
Stetigkeit von P wegen eine positive Grosse a so klein annehmen, dass 

I P(t) - P{t 0 )\ < P(t 0 ) ~P fur \t - t 0 \ < a , 

so dass: 



P(t) = P(t o) + P(t) - P(to) ^ P(to) - \P(t) - P(to)\ 
> o) - (P(to)-p) = p, 



also: 

P{t) > p > 0 (43) 

ist im ganzen Intervall to — ot . . . to + a. 

In diesem wie in dem grosseren t’ . . . t" sind aber mit P, Q auch die iibrigen 
P/x,Qfj. stetige Functionen von t und daher die Umformung (42) gestattet, so 
dass 



1 2 

6J = f SF dt = 



’ t 0 — a 



*2 



SF dt 



SF dt 



1 1 



1 1 



to + a 



to + a 



to — a 



J2 (Pn&~ 1] +QnV {fl ~ 1) ) 

U = 1 
) + Ot- 

+ J + Qv) dt = SJi + 8J2 + SJq . 



to — Ot 



(44) 



35 | Jetzt brauchen wir nur zu setzen: 



£{t) = 0 (fi 51 1 < t 0 — a , t, 0 + a < t S 1 2 ) 

= (t - t 0 + a) r +1 (f 0 + a - t) r +1 >0 
(to — a < t < to + a) 

77(f) = 0 (fi ^ t ^ t 2 ) , 







for those values of $t$ for which the $P_\mu$, $Q_\mu$, that is, the derivatives
$x^{(n+1)}, \ldots x^{(2n)}$; $y^{(n+1)}, \ldots y^{(2n)}$, all exist, which is not
yet included in (33) and (34) when $r < 2n$; in fact, we must assume that they are
continuous functions of $t$ on a certain interval in order to draw the well-known
conclusion of the calculus of variations, namely that $P$ and $Q$ must vanish
everywhere. But, under the most general assumptions, it is then possible to
furnish a rigorous proof of

Theorem II. If P and Q exist in some partial interval t' . . . t" as contin- 
uous functions of t, then they must vanish everywhere in the entire interval 
in case of a minimum. 

Hence, if $t_1 \le t' < t_0 < t'' \le t_2$, then what is being asserted is that,
under the assumptions made, it must be the case that $P(t_0) = 0$.

Assuming, say, that $P(t_0) > p > 0$ were positive, it would then be possible,
on account of the continuity of $P$, to choose a positive quantity $\alpha$
sufficiently small so that

$$|P(t) - P(t_0)| < P(t_0) - p \quad \text{for} \quad |t - t_0| \le \alpha ,$$

so that

$$P(t) = P(t_0) + P(t) - P(t_0) \ge P(t_0) - |P(t) - P(t_0)| > P(t_0) - (P(t_0) - p) = p ,$$

and hence

$$P(t) > p > 0 \tag{43}$$

in the entire interval $t_0 - \alpha \ldots t_0 + \alpha$.

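But, along with $P$, $Q$, the remaining $P_\mu$, $Q_\mu$ are continuous functions
of $t$ on this interval as well as on the greater $t' \ldots t''$. Hence, the
transformation (42) is permissible, so that

$$\delta J = \int_{t_1}^{t_2} \delta F\,dt = \left(\int_{t_1}^{t_0-\alpha} \delta F\,dt + \int_{t_0+\alpha}^{t_2} \delta F\,dt\right) + \left[\sum_{\mu=1}^{n} \left(P_\mu\xi^{(\mu-1)} + Q_\mu\eta^{(\mu-1)}\right)\right]_{t_0-\alpha}^{t_0+\alpha} + \int_{t_0-\alpha}^{t_0+\alpha} (P\xi + Q\eta)\,dt = \delta J_1 + \delta J_2 + \delta J_0 . \tag{44}$$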



Now we only need to set

$$\xi(t) = 0 \qquad (t_1 \le t \le t_0 - \alpha ,\ t_0 + \alpha \le t \le t_2)$$

$$\xi(t) = (t - t_0 + \alpha)^{r'+1}(t_0 + \alpha - t)^{r'+1} > 0 \qquad (t_0 - \alpha < t < t_0 + \alpha)$$

$$\eta(t) = 0 \qquad (t_1 \le t \le t_2) ,$$







wo r' A r und gleichzeitig r' n — 1 eine beliebig grosse positive ganze Zahl 
sein kann. Dann sind wegen: 

<o ± a) = 0 (0 ^ n ^ r') 

£ und ij mit ihren Ableitungen bis zur r ten Ordnung im ganzen Intervall ste- 
tige Functionen von t (gemass (33) und (34)) und geniigen den Grenzbedin- 
gungen (39), miissen also nach Satz I die erste Variation zum Verschwinden 
bringen, wahrend doch in (44) 

SJi = 0 , SJ 2 = 0 , 



also nach (43) 

to + a 

SJ = SJ 0 = j P£dt> 0 

to — a 

wird, so dass die Annahme P(to) > 0 und ebenso, wie analog gezeigt wird, 
die anderen P{to) < 0 und Q (to) ^ 0 fiir ein Minimum unmoglich sind. 

Wir konnen den Beweis auch fiihren, wenn wir nur solche Variationen 
zulassen, bei denen £, ?y im ganzen Intervall t\ . . . t 2 mit alien Ableitungen 
stetig und durch eine einzige analytische Formel ausgedriickt sind. 

Setzen wir namlich, einer zuerst von Herrn Prof. Weierstrass gegebenen 
Anregung folgend, wenn g eine von t unabhangige, spater zu bestimmende 
positive Grosse bedeutet, 

£(i) = {t- h ) n (t 2 - t) n e-e 2{t ~ io)2 ^ 0 

v(t) = 0 



fiir 



ti ^ t ^ t 2 , 



36 | wodurch alien Forderungen geniigt wird, da auch 



£ (M) (*i) = &\t 2 ) = 0 (fj, = 0, 1,. ..n- 1) , 



(39) 



so sind die successiven Ableitungen von der Form: 

£ (M) W = U^ 2 )e “' 2(t ~ to)2 , 

wo die ganze rationale Functionen ihrer beiden Argumente bedeuten. 
In den Intervallen t\ . . .to — a und to + ct ... t 2 ist nun iiberall: 

e -e (t-to) <; e ~ a s W egen \t — to| = c* , 



|<5F| ^ e Y | X M( i )^(i,£’ 2 )| ^ gi{g 2 )e , 

fi — 0 



also: 







where $r' \ge r$, and, at the same time, $r' \ge n - 1$ can be an arbitrarily large
positive integer. Then, on account of

$$\xi^{(\mu)}(t_0 \pm \alpha) = 0 \qquad (0 \le \mu \le r') ,$$

$\xi$ and $\eta$, together with their derivatives up to the $r$th order, are
continuous functions of $t$ on the entire interval (according to (33) and (34))
and satisfy the limit conditions (39), and hence, by Theorem I, must make the
first variation vanish, whereas in (44)

$$\delta J_1 = 0 , \qquad \delta J_2 = 0 ,$$

and hence, by (43),

$$\delta J = \delta J_0 = \int_{t_0-\alpha}^{t_0+\alpha} P\xi\,dt > 0 ,$$

so that the assumption $P(t_0) > 0$, and also the other ones $P(t_0) < 0$ and
$Q(t_0) \gtrless 0$, as is shown along similar lines, are not possible for a minimum.
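
The vanishing of the derivatives of the piecewise polynomial variation at $t_0 \pm \alpha$ can be illustrated with a small computation. This is only a sketch under stated assumptions: the values $t_0 = 0$, $\alpha = 1$, $r' = 2$ and the helper names `poly_mul`, `poly_pow`, `poly_diff`, `poly_eval`, `bump` are hypothetical and do not occur in the original text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, k):
    """Raise a polynomial to the non-negative integer power k."""
    out = [1.0]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:] or [0.0]

def poly_eval(p, t):
    """Evaluate a polynomial at t by Horner's scheme."""
    acc = 0.0
    for c in reversed(p):
        acc = acc * t + c
    return acc

def bump(t0, alpha, r_prime):
    """Coefficients of (t - t0 + alpha)^(r'+1) * (t0 + alpha - t)^(r'+1)."""
    left = poly_pow([alpha - t0, 1.0], r_prime + 1)     # (t - t0 + alpha)
    right = poly_pow([t0 + alpha, -1.0], r_prime + 1)   # (t0 + alpha - t)
    return poly_mul(left, right)

t0, alpha, r_prime = 0.0, 1.0, 2      # illustrative values only
xi = bump(t0, alpha, r_prime)

# Values of xi, xi', ..., xi^(r'+1) at both ends of the inner interval.
derivs, p = [], xi
for _ in range(r_prime + 2):
    derivs.append((poly_eval(p, t0 - alpha), poly_eval(p, t0 + alpha)))
    p = poly_diff(p)

for order, (left_val, right_val) in enumerate(derivs):
    print(order, left_val, right_val)
```

In this example the derivatives of order $0$ to $r'$ vanish at both end points, so the continuation by zero outside the inner interval is $r'$ times continuously differentiable, while the derivative of order $r' + 1$ does not vanish there.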

We are also able to carry out the proof if we only permit variations for
which $\xi$, $\eta$, together with all derivatives, are continuous on the entire
interval $t_1 \ldots t_2$ and expressed by a single analytic formula.

For, if we, following a suggestion first made by Prof. Weierstrass, set

$$\xi(t) = (t - t_1)^n (t_2 - t)^n e^{-\varrho^2(t - t_0)^2} \ge 0$$
$$\eta(t) = 0 ,$$

where $\varrho$ denotes a positive quantity that is independent of $t$ and to be
specified later, for

$$t_1 \le t \le t_2 ,$$

whereby all requirements are met, since also

$$\xi^{(\mu)}(t_1) = \xi^{(\mu)}(t_2) = 0 \qquad (\mu = 0, 1, \ldots n - 1) , \tag{39}$$

then the successive derivatives are of the form

$$\xi^{(\mu)}(t) = \xi_\mu(t; \varrho^2)\, e^{-\varrho^2(t - t_0)^2} ,$$

where the $\xi_\mu$ denote integral rational functions of both of their arguments.
Now, in the intervals $t_1 \ldots t_0 - \alpha$ and $t_0 + \alpha \ldots t_2$, we
everywhere have

$$e^{-\varrho^2(t - t_0)^2} \le e^{-\alpha^2\varrho^2} \quad \text{on account of} \quad |t - t_0| \ge \alpha ,$$

and hence

$$|\delta F| \le e^{-\alpha^2\varrho^2} \sum_{\mu=0}^{n} \left|X_\mu(t)\,\xi_\mu(t, \varrho^2)\right| \le g_1(\varrho^2)\, e^{-\alpha^2\varrho^2} ,$$







wo gi(g 2 ) die ganze Function von g 2 mit positiven Coefficienten bezeichnet, 
die man erhalt, wenn man in der Summe alle Coefficienten der Potenzen 
von g 2 durch ihre grossten Betrage im Intervall ti ^ t ^ t 2 ersetzt. 

Es wird demnach in (44) 

to — a t2 

\SJi\ ^ j |<5P| dt + j |<5F| dt < (t 2 - ti)si(e 2 )e- aV 

t 1 t 0 + a 

und ganz ebenso: 



\SJ 2 \=. 



Y p n£n-i(t,e 2 ) 

■A» = 1 



to + a 



to — a. 



% <? 2 (e 2 )e-“ V 



und 



\SJi+SJ 2 \< ((t 2 -t 1 )g 1 {g 2 )+g 2 {g 2 ))e “ 2<?2 = 5 (g 2 )e . 

Im Intervall to — a .. .to + a aber besitzt: 

Zo(t) = (t-t 1 ) n (t 2 -t) n 

eine positive untere Grenze q, so dass wegen (43) (P > p) 



37 



to + a to + a 

SJ 0 = J P£dt>pq J e~ e2{t - ta) 2 dt 

to — a to — a 

OtQ 

^ pq -z 2 i PQ / /— x ^ n 
> — > e dz = — (Jit — x) > 0 , 

Q ^ Q 

— OtQ 



wo wegen: 



+oo 

J e~ z dz = \fn 

— OO 



x fiir jedes vorgeschriebene a durch Wahl eines hinreichend grossen g beliebig 
klein gemacht werden kann. 

Jetzt kann endlich die Beziehung: 



SJi + SJ 2 

SJq 



gg(g 2 )e- aV 

pq(V n — x) 



durch Vergrosserung von g sicher erreicht werden, da 

lim gg{g 2 )e~ e a = 0 

Q— OO 







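where $g_1(\varrho^2)$ denotes the entire function of $\varrho^2$ with positive
coefficients which is obtained by replacing in the sum all coefficients of the
powers of $\varrho^2$ by their greatest absolute values in the interval
$t_1 \le t \le t_2$.

In (44), we therefore have

$$|\delta J_1| \le \int_{t_1}^{t_0-\alpha} |\delta F|\,dt + \int_{t_0+\alpha}^{t_2} |\delta F|\,dt < (t_2 - t_1)\, g_1(\varrho^2)\, e^{-\alpha^2\varrho^2}$$

and likewise

$$|\delta J_2| = \left|\left[\sum_{\mu=1}^{n} P_\mu\,\xi_{\mu-1}(t, \varrho^2)\right]_{t_0-\alpha}^{t_0+\alpha}\right| e^{-\alpha^2\varrho^2} \le g_2(\varrho^2)\, e^{-\alpha^2\varrho^2}$$

and

$$|\delta J_1 + \delta J_2| < \left((t_2 - t_1)\, g_1(\varrho^2) + g_2(\varrho^2)\right) e^{-\alpha^2\varrho^2} = g(\varrho^2)\, e^{-\alpha^2\varrho^2} .$$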
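In the interval $t_0 - \alpha \ldots t_0 + \alpha$, however,

$$\xi_0(t) = (t - t_1)^n (t_2 - t)^n$$

possesses a positive lower bound $q$ so that, on account of (43) $(P > p)$,

$$\delta J_0 = \int_{t_0-\alpha}^{t_0+\alpha} P\xi\,dt > pq \int_{t_0-\alpha}^{t_0+\alpha} e^{-\varrho^2(t - t_0)^2}\,dt = \frac{pq}{\varrho} \int_{-\alpha\varrho}^{\alpha\varrho} e^{-z^2}\,dz = \frac{pq}{\varrho}\left(\sqrt{\pi} - \kappa\right) > 0 ,$$

where $\kappa$, on account of

$$\int_{-\infty}^{+\infty} e^{-z^2}\,dz = \sqrt{\pi} ,$$

can be made arbitrarily small for every prescribed $\alpha$ by choice of a
sufficiently large $\varrho$.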
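
How quickly the quantity subtracted from $\sqrt{\pi}$ (written $\kappa$ here) shrinks can be seen numerically. The sketch below is an illustration only: it uses the standard identity $\int_{-a}^{a} e^{-z^2}\,dz = \sqrt{\pi}\,\operatorname{erf}(a)$, so that $\kappa = \sqrt{\pi}\,\operatorname{erfc}(\alpha\varrho)$; the function names, the midpoint rule, and the sample values of $\alpha$ and $\varrho$ are assumptions.

```python
import math

def kappa(alpha, rho):
    """sqrt(pi) minus the integral of exp(-z^2) over [-alpha*rho, alpha*rho]."""
    return math.sqrt(math.pi) * math.erfc(alpha * rho)

def inner_integral(alpha, rho, steps=20000):
    """Midpoint-rule value of the integral of exp(-rho^2 s^2) over |s| <= alpha,
    where s stands for t - t0."""
    width = 2.0 * alpha / steps
    total = 0.0
    for k in range(steps):
        s = -alpha + (k + 0.5) * width
        total += math.exp(-(rho * s) ** 2)
    return total * width

alpha = 0.1                      # illustrative half-width of the inner interval
for rho in (5.0, 20.0, 80.0):
    closed_form = (math.sqrt(math.pi) - kappa(alpha, rho)) / rho
    print(rho, kappa(alpha, rho), closed_form, inner_integral(alpha, rho))
```

For fixed $\alpha$ the computed $\kappa$ decreases rapidly as $\varrho$ grows, and the midpoint-rule value of the inner integral agrees with $(\sqrt{\pi} - \kappa)/\varrho$ to within the discretisation error.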

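Finally, we are now able to easily arrive at the relation

$$\left|\frac{\delta J_1 + \delta J_2}{\delta J_0}\right| < \frac{\varrho\, g(\varrho^2)\, e^{-\alpha^2\varrho^2}}{pq\left(\sqrt{\pi} - \kappa\right)} < 1$$

by increasing $\varrho$, since

$$\lim_{\varrho = \infty} \varrho\, g(\varrho^2)\, e^{-\varrho^2\alpha^2} = 0 ,$$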







ist, und es wird schliesslich nach (44) 



SJ = SJ 0 1 + 



SJi + 8J2 
~SJn 



> 0 , 



wahrend doch 5J = 0 sein sollte, womit der Beweis des Satzes II vollendet 
ist. 

Die beiden Gleichungen P = 0 und Q = 0 hangen zusammen durch (15): 

Px' + Qy' = 0. 

Setzt man daher Py' — Qx' = G, so folgt: 

P(x' 2 + y' 2 ) = y'G , Q{x' 2 + y' 2 ) = -x'G ; (45) 



es werden also, da x' und y' nach (35) nicht gleichzeitig verschwinden sollen, 
die beiden Differentialgleichungen der Ordnung 2 n ersetzt durch eine einzi- 
ge: G = 0, die wir als „die Differentialgleichung des Problems u bezeichnen 
wollen. 

Setzt man nun: 



x'£ + y'i 7 = (x' 2 + y ,2 )v , £ = x'v + y'w , 

y't; + x'y = (a/ 2 + y' 2 )w , 77 = y'v — x'w , 

so wird 

P£ + Qv = {Px' + Qy')v + ( Py ' - Qx')w = Gw , (47) 

und daher, wenn in (42) der integrierte Teil nur angedeutet wird: 



fiir jedes Intervall t' . . .t" , fiir welches die Umformung (42) gestattet ist. 



Gw dt 




Stetigkeits-Bedingungen. 

Was Herr Weierstrass fiir das Integral: 

*2 

J = J F{x,y;x',y’) dt 

tl 

dF dF 

von den Functionen — — und — — bewiesen hat, ahnlich wie P. du Bois-Rey- 
ox' ay' 

mond (Math. Annalen XV) und Todhunter (Researches on the Calculus of 
Variations) von analogen Functionen in der gewohnlichen Darstellung, lasst 
sich folgendermassen verallgemeinern: 







and, by (44), eventually

$$\delta J = \delta J_0 \left(1 + \frac{\delta J_1 + \delta J_2}{\delta J_0}\right) > 0 ,$$

while we ought to have $\delta J = 0$, which completes the proof of Theorem II.

The two equations $P = 0$ and $Q = 0$ are connected with one another
via (15):

$$Px' + Qy' = 0 .$$

If we therefore set $Py' - Qx' = G$, it then follows that

$$P(x'^2 + y'^2) = y'G , \qquad Q(x'^2 + y'^2) = -x'G ; \tag{45}$$

thus, since $x'$ and $y'$ are not supposed to simultaneously vanish according
to (35), the two differential equations of order $2n$ are replaced by a single
one: $G = 0$, which we shall call “the differential equation of the problem”.

If one now sets

$$x'\xi + y'\eta = (x'^2 + y'^2)\,v , \qquad \xi = x'v + y'w ,$$
$$y'\xi - x'\eta = (x'^2 + y'^2)\,w , \qquad \eta = y'v - x'w , \tag{46}$$

then

$$P\xi + Q\eta = (Px' + Qy')\,v + (Py' - Qx')\,w = Gw , \tag{47}$$

and hence, if the integrated part in (42) is merely indicated,

$$\int_{t'}^{t''} \delta F\,dt = \Big[\ \cdots\ \Big]_{t'}^{t''} + \int_{t'}^{t''} Gw\,dt$$

for every interval $t' \ldots t''$ for which the transformation (42) is permissible.
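
The algebra behind (45) and (47) can be spot-checked with arbitrary numbers. The following is a minimal sketch with assumed sample values for $x'$, $y'$, $G$, $\xi$, $\eta$ at a single parameter value; the function name `decompose` is hypothetical.

```python
def decompose(xp, yp, xi, eta):
    """Return v, w with xi = x'v + y'w and eta = y'v - x'w (x', y' not both zero)."""
    s = xp * xp + yp * yp
    v = (xp * xi + yp * eta) / s
    w = (yp * xi - xp * eta) / s
    return v, w

xp, yp = 0.6, -1.3                  # illustrative values of x', y'
G = 2.5                             # illustrative value of G = P y' - Q x'
s = xp * xp + yp * yp
P, Q = yp * G / s, -xp * G / s      # relation (45)

xi, eta = 0.7, 0.2                  # illustrative values of the variation
v, w = decompose(xp, yp, xi, eta)

lhs = P * xi + Q * eta              # left-hand side of (47)
rhs = G * w                         # right-hand side of (47)
print(lhs, rhs)
```

With $P$ and $Q$ defined through (45), the relation $Px' + Qy' = 0$ holds automatically, and only the component $w$ of the variation contributes to $P\xi + Q\eta$.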



Continuity conditions. 
What Weierstrass has shown for the integral

$$J = \int_{t_1}^{t_2} F(x, y; x', y')\,dt$$

of the functions $\frac{\partial F}{\partial x'}$ and $\frac{\partial F}{\partial y'}$,
like P. du Bois-Reymond (1879a, b) and Todhunter (1871) in the case of analogous
functions in the usual representation, can be generalized as follows:







Satz III. Fur ein Curvenstiick a, das den Bedingungen (32) bis (35) 
der A geniigt und das ein Minimum des Integrates liefern, oder allgemeiner, 
dessen erste Variation in der oben angegebenen Weise verschwinden soil, diir- 
fen die Grossen Pi, P 2 ■ ■ ■ P n = X n ; Q 1, Q 2, . . . Q n = Y n , als Functionen von t 
betrachtet, an einer Stelle to keine endlichen Spriinge erleiden, wenn zu bei- 
den Seiten von to in einer beliebig kleinen Umgebung t' . . . t" die Functionen 
x, x ' , . . . a++ y, y' , . . . y( 2n l und damit auch P, Q stetig sind. 

Um diesen Satz fur eine dieser Grossen P\ zu erweisen, nehmen wir eine 
Variation £, 77 von der Beschaffenheit, dass 

| 77(f) = 0 (fi ^ t ^ 1 2 ) 

^(t) = 0 (ti ^ t ^ t', t"^t^ 1 2 ), 

=^-i)(f") =0 (p = 1,2,.../ + 1 ^ n) (48) 



und ^(t) auch im Intervall t' . . . t" mit seinen Ableitungen bis zur r' ten Ord- 
nung stetig ist. 

Dann ist: 

t2 ^0 t 

SJ = f SF dt = 



hi 



t' t 0 

-\to — 0 to — 0 



E P + (M_1) 

Lpt = 1 



Gw dt 



E p + 

— 1 



t' t' 

- t" t" 

(/*-!) J 

■ to + 0 to + 0 



Gw dt , 



also wegen G = 0 in den beiden Intervallen t' . . . to— 0 und to+0 . . . t" (Satz II) 
und wegen (48): 



6,J = 



E P + (M ~ 1} 

.m = 1 



to — 0 



— P\(t-o — 0) — P\{to + 0) 



to+0 



und da nach Satz I SJ = 0 sein muss: 



P A (t 0 -0) = P A (t 0 + 0) (A = 1, 2, ... n) 

q. e. d. 



Doch wurde bei diesem Beweise vorausgesetzt, dass P\ auf beiden Seiten 
von to nach bestimmten Grenzwerten P\(to A 0) convergiert, was z. B. fiir 







Theorem III. Given a curve segment $a$ that satisfies the conditions
(32) to (35) of the totality $A$ and furnishes a minimum of the integral, or
more generally, whose first variation shall vanish in the manner specified
above, the quantities $P_1, P_2, \ldots P_n = X_n$; $Q_1, Q_2, \ldots Q_n = Y_n$,
considered as functions of $t$, may not suffer any finite jump discontinuities
at a position $t_0$ if, on either side of $t_0$, the functions
$x, x', \ldots x^{(2n)}$; $y, y', \ldots y^{(2n)}$, and hence also $P$, $Q$, are
continuous on an arbitrarily small vicinity $t' \ldots t''$.

In order to prove this theorem for one of these quantities $P_\lambda$, we assume
a variation $\xi$, $\eta$ constituted so that

$$\eta(t) = 0 \qquad (t_1 \le t \le t_2)$$

$$\xi(t) = 0 \qquad (t_1 \le t \le t' ,\ t'' \le t \le t_2) ,$$

$$\xi^{(\mu-1)}(t') = \xi^{(\mu-1)}(t'') = 0 \qquad (\mu = 1, 2, \ldots r' + 1 \ge n) \tag{48}$$

and $\xi(t)$, together with its derivatives up to the $r'$th order, is continuous
on the interval $t' \ldots t''$.

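Then

$$\delta J = \int_{t_1}^{t_2} \delta F\,dt = \int_{t'}^{t_0} + \int_{t_0}^{t''} = \left[\sum_{\mu=1}^{n} P_\mu\xi^{(\mu-1)}\right]_{t'}^{t_0-0} + \int_{t'}^{t_0-0} Gw\,dt + \left[\sum_{\mu=1}^{n} P_\mu\xi^{(\mu-1)}\right]_{t_0+0}^{t''} + \int_{t_0+0}^{t''} Gw\,dt ,$$

and hence, on account of $G = 0$ in the two intervals $t' \ldots t_0 - 0$ and
$t_0 + 0 \ldots t''$ (Theorem II) and on account of (48),

$$\delta J = \left[\sum_{\mu=1}^{n} P_\mu\xi^{(\mu-1)}\right]_{t_0+0}^{t_0-0} = P_\lambda(t_0 - 0) - P_\lambda(t_0 + 0) ,$$

and since, by Theorem I, necessarily $\delta J = 0$:

$$P_\lambda(t_0 - 0) = P_\lambda(t_0 + 0) \qquad (\lambda = 1, 2, \ldots n) ,$$

q. e. d.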



In this proof, however, we took for granted that, on either side of $t_0$, $P_\lambda$
converges to certain limits $P_\lambda(t_0 \pm 0)$, which, e. g., is not the case for the







40 



die Function sin - an ihrer einzigen Unstetigkeitsstelle t = 0 nicht stattfin- 

det; ferner wurde eine Variation | £(f) benutzt, deren hohere Ableitungen an 
den Stellen t',t" unstetig sind, und es fehlte schliesslich noch der Beweis fiir 
die Existenz einer Function £(f) mit den angenommenen Eigenschaften im 
Intervall t' ^ t ^ t". Jetzt soil der Beweis von alien diesen Bedenken befreit 
werden durch Benutzung ganz ahnlicher Hilfsmittel wie der zum Beweise des 
vorigen Satzes II angewendeten. 

Es seien a, f3 beliebig kleine positive Grossen, 



t\ ^ t! < t 0 - a < t 0 < t 0 + < t" ^ t 2 



und es werde vorausgesetzt, dass in den beiden Intervallen t'...to—a und to + 
/3 . . . t" , wie klein auch a und j3 genommen werden, P und Q stetig verlaufen, 
also nach Satz II bestandig verschwinden, dann kann, so wircl behauptet, 
fiir a, /3 eine obere Grenze ao so angegeben werden, dass 

\Px(t 0 + p) - P x (t 0 ~ a)\ <£ (49) 



fiir ein beliebig klein vorgeschriebenes £ unter der Voraussetzung des Ver- 
schwindens der ersten Variation. 

Wir betrachten dazu die folgende jedenfalls erlaubte Variation: 



r] = 0 

C = C(*; q) = Co (t, g)e~ e[t ~ to)2 (g > o) , 

wo Co {t, g) eine ganze rationale Function von t und g sein soil und so beschaf- 
fen, dass 

&~ 1 \t 1 )= C ( ' i - 1) (< 2 ) = 0, (50) 

= = 0, 

C ( ^- 1) (t 0 ) = e^A (fj, = 1,2, .. .n) . 

Den ersten Bedingungen wird geniigt durch den Ansatz: 

Co (t, q) = (t — G nt 2 - t) n (t - t'nt" - Q ) 

= h{t)<p(t, g) , 

und es bleibt nur noch die ganze Function tp(t, g) so zu bestimmen, dass auch 
die letzte Bedingung befriedigt wird, namlich 



41 



= D^~ 
m - 1 

= £ 

i = 0 

M — 1 

= £ 

i = 0 






M ~ 1 
i 



(t = to) 
h{t)e ~ e{t - to)2 



M - 1 
i 



(p ( ' l \t 0 ,g)h f _ l - i - 1 (t 0 ,g) , 




Investigations in the calculus of variations 






function $\sin\frac{1}{t}$ at its only discontinuity at $t = 0$; furthermore, we used a variation $\xi(t)$ whose higher derivatives are discontinuous at the points $t', t''$. Finally, what was missing was a proof of the existence of a function $\xi(t)$ with the assumed properties in the interval $t' \le t \le t''$. In what follows, the proof shall be freed of all these concerns by means very similar to those used in the proof of Theorem II.

Consider arbitrarily small positive quantities $\alpha, \beta$ such that

$$t_1 \le t' < t_0 - \alpha < t_0 < t_0 + \beta < t'' \le t_2$$

and assume that $P$ and $Q$ are continuous on both intervals $t' \dots t_0 - \alpha$ and $t_0 + \beta \dots t''$, however small $\alpha$ and $\beta$ may be, and hence, by Theorem II, always vanish. Then, so it is asserted, it is possible to specify an upper bound $\alpha_0$ for $\alpha, \beta$ so that

$$|P_\lambda(t_0 + \beta) - P_\lambda(t_0 - \alpha)| < \varepsilon \tag{49}$$

for an arbitrarily small prescribed $\varepsilon$, provided that the first variation vanishes.

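For this purpose let us consider the following variation, which is certainly admissible:

$$\eta = 0 , \qquad \xi = \xi(t; \varrho) = \xi_0(t, \varrho)\, e^{-\varrho (t - t_0)^2} \quad (\varrho > 0) ,$$

where $\xi_0(t, \varrho)$ is supposed to be an integral rational function of $t$ and $\varrho$ so constituted that

$$\xi^{(\mu-1)}(t_1) = \xi^{(\mu-1)}(t_2) = 0 , \tag{50}$$

$$\xi^{(\mu-1)}(t') = \xi^{(\mu-1)}(t'') = 0 ,$$

$$\xi^{(\mu-1)}(t_0) = e_{\mu,\lambda} \quad (\mu = 1, 2, \dots n) .$$

The first conditions are met by the ansatz

$$\xi_0(t, \varrho) = (t - t_1)^n (t_2 - t)^n (t - t')^n (t'' - t)^n\, \varphi(t, \varrho) = h(t)\,\varphi(t, \varrho) ,$$

and the only task remaining is to determine the entire function $\varphi(t, \varrho)$ so that the last condition, too, is satisfied, namely

$$e_{\mu,\lambda} = D^{\mu-1}\bigl[h(t)\,\varphi(t,\varrho)\,e^{-\varrho(t-t_0)^2}\bigr]_{t=t_0}$$

$$= \sum_{i=0}^{\mu-1}\binom{\mu-1}{i}\,\varphi^{(i)}(t,\varrho)\,D^{\mu-i-1}\bigl[h(t)\,e^{-\varrho(t-t_0)^2}\bigr]_{t=t_0}$$

$$= \sum_{i=0}^{\mu-1}\binom{\mu-1}{i}\,\varphi^{(i)}(t_0,\varrho)\,h_{\mu-i-1}(t_0,\varrho) ,$$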







wenn 

D i \h(t)e ~ e(t - to)2 1 =hi(t ie )e - e{t - to) 2 

gesetzt wird, wo auch hi eine ganze Function ist und der Exponentialfactor 
fiir t = to sich auf 1 reduciert. 

Nun lassen sich die vorliegenden Gleichungen fiir /x = 1, 2 ... n successive 
auflosen nach den Unbekannten 

v>(to,e), v'(t 0 ,g) = 9 ^ , 

J t = to 

da immer die hochste Ableitung 1 ^(to, q) den Coefficienten ho{t, g) = 
h(to) > 0 besitzt. So ergeben sich diese Grossen ~ ^(to, g) samtlich als 
ganze Functionen von g, und man braucht nur noch 

n — 1 1 

P(t,g)= 

i — 0 

zu setzen, um eine Function 

zu erhalten, die in der That alien Bedingungen (50) geniigt und deren Ablei- 
tungen von der Form sind: 

&\t)=Ut,Q)e - e{t ~ to)2 • 

Ganz dieselbe Methode unter der vereinfachenden Annahme g = 0 lies- 
se sich anwenden zur Bildung der vorhin benutzten Function £(t) mit den 
Eigenschaften (48). 

42 | Fiir unsere Variation Sx = £(t, g), Sy = 0 wird nun: 

*2 

SJ = J SF dt 

tl 




SJ\ + 6 J2 + • J 

Ahnlich wie beim Beweise von II lasst sich jetzt eine ganze rationale Func- 
tion g(g) so angeben, dass fiir das ganze Intervall t\ ^ t ^ t 2 

n n 

|AF|= =e J2 x ^)Ut,g) 

/i — 0 /i — 0 

< e -^ t - t o) 2 g(g) 



(52) 







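if we set

$$D^{i}\bigl[h(t)\,e^{-\varrho(t-t_0)^2}\bigr] = h_i(t,\varrho)\,e^{-\varrho(t-t_0)^2} ,$$

where $h_i$, too, is an entire function and the exponential factor reduces to 1 for $t = t_0$.

Now it is possible to solve the equations at hand for $\mu = 1, 2, \dots n$ successively for the unknowns

$$\varphi(t_0,\varrho), \quad \varphi'(t_0,\varrho) = \Bigl[\frac{\partial\varphi}{\partial t}\Bigr]_{t=t_0}, \ \dots \ \varphi^{(n-1)}(t_0,\varrho) ,$$

since the highest derivative $\varphi^{(\mu-1)}(t_0,\varrho)$ always possesses the coefficient $h_0(t_0,\varrho) = h(t_0) > 0$. Thus all these quantities $\varphi^{(\mu-1)}(t_0,\varrho)$ arise as entire functions of $\varrho$, and we only need to set

$$\varphi(t,\varrho) = \sum_{i=0}^{n-1}\frac{1}{i!}\,\varphi^{(i)}(t_0,\varrho)\,(t-t_0)^i$$

in order to obtain a function

$$\xi(t) = h(t)\,\varphi(t,\varrho)\,e^{-\varrho(t-t_0)^2}$$

which indeed satisfies all conditions (50) and whose derivatives are of the form

$$\xi^{(\mu)}(t) = \xi_\mu(t,\varrho)\,e^{-\varrho(t-t_0)^2} .$$

It is possible to use precisely the same method under the simplifying assumption $\varrho = 0$ in order to form the function $\xi(t)$ used above with the properties (48).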
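
As an editorial illustration that is not part of the original text, the successive solution can be written out for the lowest orders. The sketch assumes $\lambda = 1$ and writes $\varrho$ for the positive parameter of the exponential factor; the explicit values of $h_1$ and $h_2$ at $t_0$ are our own computation by the Leibniz rule, not formulas taken from the source.

```latex
% Editorial sketch (assumption: \lambda = 1, so e_{1,1} = 1 and e_{\mu,1} = 0 for \mu > 1).
% Values at t = t_0 of h_i defined by D^i[h e^{-\varrho (t-t_0)^2}] = h_i e^{-\varrho (t-t_0)^2}:
\begin{align*}
h_0(t_0,\varrho) &= h(t_0), \\
h_1(t_0,\varrho) &= h'(t_0), \\
h_2(t_0,\varrho) &= h''(t_0) - 2\varrho\,h(t_0).
\end{align*}
% The conditions for \mu = 1, 2, 3 read
%   1 = \varphi h_0,
%   0 = \varphi h_1 + \varphi' h_0,
%   0 = \varphi h_2 + 2\varphi' h_1 + \varphi'' h_0   (all taken at t_0),
% and are solved one after another, each time dividing by h_0 = h(t_0) > 0:
\begin{align*}
\varphi(t_0,\varrho)   &= \frac{1}{h(t_0)}, \\
\varphi'(t_0,\varrho)  &= -\frac{h'(t_0)}{h(t_0)^2}, \\
\varphi''(t_0,\varrho) &= \frac{2\varrho}{h(t_0)} + \frac{2h'(t_0)^2 - h(t_0)\,h''(t_0)}{h(t_0)^3}.
\end{align*}
% Each unknown is a polynomial in \varrho. For n = 1 only the first equation is needed:
\[
\xi(t;\varrho) = \frac{h(t)}{h(t_0)}\,e^{-\varrho (t-t_0)^2},
\qquad h(t) = (t-t_1)(t_2-t)(t-t')(t''-t).
\]
```

For $\varrho = 0$ the three values are the Taylor coefficients of $1/h$ at $t_0$, which is consistent with the remark that the same method with $\varrho = 0$ produces the function with the properties (48).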

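In the case of our variation $\delta x = \xi(t, \varrho)$, $\delta y = 0$ we now have

$$\delta J = \int_{t_1}^{t_2} \delta F\,dt = \Bigl\{\int_{t_1}^{t'} + \int_{t''}^{t_2}\Bigr\} + \int_{t_0-\alpha}^{t_0+\beta} + \Bigl\{\int_{t'}^{t_0-\alpha} + \int_{t_0+\beta}^{t''}\Bigr\} = \delta J_1 + \delta J_2 + \delta J_3 . \tag{51}$$

Similar to the proof of II, it is now possible to specify an entire rational function $g(\varrho)$ so that, for the entire interval $t_1 \le t \le t_2$,

$$|\delta F| = \Bigl|\sum_{\mu=0}^{n} X_\mu\,\xi^{(\mu)}\Bigr| = e^{-\varrho(t-t_0)^2}\,\Bigl|\sum_{\mu=0}^{n} X_\mu\,\xi_\mu(t,\varrho)\Bigr| < e^{-\varrho(t-t_0)^2}\,g(\varrho) \tag{52}$$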



(52) 







wird, da nach unserer Annahme die X M auch an der Unstetigkeitsstelle endlich 
bleiben sollen. Fiir (t — to) 2 A r 2 , also in den Intervallen t\ . . A' und t" . . . t 2 , 
wenn r die kleinere der Differenzen to — t' und t" — to bezeichnet, wird daher: 

\SF\ < e ~^ 2 s(e) 



und 



|<5Ji| 




ti)g{g)e er2 




(53) 



fiir ein beliebig kleines e, wenn bei vorgeschriebenem r der Parameter g 
hinreichend gross genommen wird. 

Ferner wird fiir to — a<t<to + P- 

\SF\<e-^-^ 2 g(g)^g(g) 



wegen g > 0, und daher kann auch fiir das eben bestimmte g noch der zweite 
Teil in (51) 



\SJ 2 \ 



to + P 



SF dt\ 



< g{g)(a + P) < 2a 0 g{g) < - 



Po — at 



(54) 



43 | gemacht werden durch hinreichend klein gewahltes a o und 

ol < ol o , /? < cto ■ 

In den Intervallen t ' ... to — ol und to + f3 .. A" endlich sind nach der 
Annahme alle stetige Functionen von t und daher ist die Formel (42) 
anwendbar, so dass 



5 Jo = 






M = 1 



to — a to — a 



Pl ; dt 



E 

^=1 



v f 

t" t" 

-i) I 



dt. , 



to + P *0 + 0 



oder, wegen P = 0, da sonst nach II gewiss kein Minimum stattfinden konnte, 
mit Hilfe von (50) 



6 Jo 



' 



E P ^ (M_1) 

L m = l 



J to + /3 







since, by our assumption, the $X_\mu$ are supposed to remain finite at the discontinuity as well. For $(t - t_0)^2 \ge r^2$, and hence in the intervals $t_1 \dots t'$ and $t'' \dots t_2$, where $r$ denotes the smaller of the differences $t_0 - t'$ and $t'' - t_0$, we therefore have

$$|\delta F| < e^{-\varrho r^2}\,g(\varrho)$$

and

$$|\delta J_1| = \Bigl|\int_{t_1}^{t'} + \int_{t''}^{t_2}\Bigr| < (t_2 - t_1)\,g(\varrho)\,e^{-\varrho r^2} < \frac{\varepsilon}{3} , \tag{53}$$

given an arbitrarily small $\varepsilon$, if we take the parameter $\varrho$ sufficiently large for prescribed $r$.

Furthermore, for $t_0 - \alpha < t < t_0 + \beta$,

$$|\delta F| < e^{-\varrho(t-t_0)^2}\,g(\varrho) \le g(\varrho) ,$$

on account of $\varrho > 0$, and therefore it is also possible to make the second part in (51)

$$|\delta J_2| = \Bigl|\int_{t_0-\alpha}^{t_0+\beta} \delta F\,dt\Bigr| < g(\varrho)(\alpha + \beta) < 2\alpha_0\,g(\varrho) < \frac{\varepsilon}{3} \tag{54}$$

for the $\varrho$ just determined by choice of a sufficiently small $\alpha_0$ and

$$\alpha < \alpha_0 , \quad \beta < \alpha_0 .$$
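
The order in which the two constants are chosen matters here, so a short editorial sketch may help; it is not part of the original text. The constants $C$ and $m$ below are hypothetical: they only express that the bounding function of the parameter is a polynomial, and we assume that this bound is positive, so that division by it is allowed.

```latex
% Editorial sketch of the order of choices behind (53) and (54).
% Assumption: g is a polynomial of degree m, so for some constant C > 0
%   0 < g(\varrho) \le C (1+\varrho)^m   for all \varrho \ge 0.
% Step 1: with r fixed, choose \varrho. Exponential decay beats polynomial growth:
\[
(t_2 - t_1)\,g(\varrho)\,e^{-\varrho r^2}
\;\le\; (t_2 - t_1)\,C\,(1+\varrho)^m\,e^{-\varrho r^2}
\;\longrightarrow\; 0 \qquad (\varrho \to \infty),
\]
% hence some \varrho makes the left-hand side smaller than \varepsilon/3.
% Step 2: with this \varrho now fixed, choose \alpha_0:
\[
\alpha_0 < \frac{\varepsilon}{6\,g(\varrho)}
\quad\Longrightarrow\quad
g(\varrho)(\alpha+\beta) < 2\alpha_0\,g(\varrho) < \frac{\varepsilon}{3}
\qquad (\alpha, \beta < \alpha_0).
\]
```

The point of the sketch is only that the second choice depends on the first: the parameter of the exponential is fixed before the width of the inner interval is shrunk.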

Finally, all are, by assumption, continuous functions of t in the intervals 
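$t' \dots t_0 - \alpha$ and $t_0 + \beta \dots t''$, and the formula (42) is therefore applicable so that

$$\delta J_3 = \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t'}^{t_0-\alpha} + \int_{t'}^{t_0-\alpha} P\xi\,dt + \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t_0+\beta}^{t''} + \int_{t_0+\beta}^{t''} P\xi\,dt$$

or, on account of $P = 0$, since otherwise, by II, there would certainly be no minimum, by means of (50)

$$\delta J_3 = \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t_0+\beta}^{t_0-\alpha}$$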






und 



SJ 3 = P\(to - ot) - Pa (to + /?) + i>(Q,O!,0) , (55) 



44 



WO 



V’te.a,/?) = 



n 

S>(f 



(m- 1) 



L M = 1 



-| to — OL 




J to +P 



wegen (50) 

lim ^ ~ :) (t 0 - a) = lim ~ 1} (t 0 + /3) = ~ x) (t 0 ) =e /it \, 

a=0 p = 0 

und weil auch die P M bei der Annaherung an to nicht iiber alle Grenzen 
wachsen sollen, durch Verkleinerung von a und f} dem Betrage nach beliebig 
klein gemacht werden kann, also auch 

\i>(Q,a,/3)\ < ^ . (56) 

Nun soil aber: SJ = 6 J\ + SJ 2 + SJ 3 = 0 sein (51) auch fur die hier 
betrachtete Variation £, also nach (55) 

| Px (to + /?) - Pa (to - a)| ^ \ 6 Ji\ + \SJ 2 \ + j3)\ 



und nach (53), (54), (56) < i + i + ^ = £ ’ 

wie in (49) behauptet, wenn fiir ein beliebig klein vorgeschriebenes e die obere 
Grenze an von a und (3 hinreichend klein genommen wird. 

Die Giiltigkeit dieses jetzt vollstandig bewiesenen Satzes III stiitzt sich 
jedoch auf die Voraussetzung, dass gemass (34) (vergl. die dort gemachte 
Bemerkung) die (y = 0,1,... r) selbst im 

ganzen betrachteten Intervall von a iiberall stetig verlaufen, nicht nur ihre 
Osculations-Invarianten. Im anderen Falle wiirden die dem Beweise zu Grun- 
de gelegten Variationen oft nicht mehr zu den „erlaubten“ gehoren, und es 
waren nicht mehr die P M , Q M , sondern andere Functionen, deren ausnahms- 
lose Stetigkeit behauptet werden konnte. Solche Ausdriicke aber erhalt man 
z. B., wenn man fiir t die Bogenlange s von a oder allgemeiner eine mit ihren r 
ersten Ableitungen stetige Function von s als unabhangige Variable einfiihrt 

x d^ y 

und demgemass in den P ;J , Q p die Argumente x^\ y^) durch die 

ersetzt, die sich ihrerseits wieder durch die a M und auch durch die x^\ y^ 
ausdriicken lassen. Doch soli hierauf jetzt nicht naher eingegangen werden. 

Bisher war die Anzahl r der ersten Ableitungen x^\ y <,l K deren Stetigkeit 
in (34) fiir alle Curven A gefordert wurde, noch unbestimmt gelassen worden 
und die Satze I, II, III fiir beliebiges r abgeleitet. Es wird jetzt aber gezeigt 
werden, dass 



r > n — 1 







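and

$$\delta J_3 = P_\lambda(t_0 - \alpha) - P_\lambda(t_0 + \beta) + \psi(\varrho, \alpha, \beta) , \tag{55}$$

where

$$\psi(\varrho, \alpha, \beta) = \Bigl[\sum_{\mu=1}^{n} P_\mu\bigl(\xi^{(\mu-1)} - e_{\mu,\lambda}\bigr)\Bigr]_{t_0+\beta}^{t_0-\alpha} ,$$

on account of (50),

$$\lim_{\alpha = 0} \xi^{(\mu-1)}(t_0 - \alpha) = \lim_{\beta = 0} \xi^{(\mu-1)}(t_0 + \beta) = \xi^{(\mu-1)}(t_0) = e_{\mu,\lambda} ,$$

and because the $P_\mu$ are not supposed to increase beyond all bounds either as they approach $t_0$, can be made arbitrarily small in absolute value by decreasing $\alpha$ and $\beta$, and hence also

$$|\psi(\varrho, \alpha, \beta)| < \frac{\varepsilon}{3} . \tag{56}$$

But now we are supposed to have $\delta J = \delta J_1 + \delta J_2 + \delta J_3 = 0$ (51) also for the variation $\xi$ under consideration here, and hence, by (55),

$$|P_\lambda(t_0 + \beta) - P_\lambda(t_0 - \alpha)| \le |\delta J_1| + |\delta J_2| + |\psi(\varrho, \alpha, \beta)|$$

and, by (53), (54), (56), $< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$, as asserted in (49), if, given an arbitrarily small prescribed $\varepsilon$, the upper bound $\alpha_0$ of $\alpha$ and $\beta$ is taken sufficiently small.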

But the validity of Theorem III, whose proof is now complete, rests on the assumption that, according to (34) (cf. the observation made there), not only their osculation invariants but the $x^{(\nu)} = \varphi^{(\nu)}(t)$, $y^{(\nu)} = \psi^{(\nu)}(t)$ $(\nu = 0, 1, \dots r)$ themselves are everywhere continuous on the entire interval of $a$ under consideration. For otherwise, the variations on which the proof is based would often no longer belong to the “admissible” ones, and it would no longer be the $P_\mu$, $Q_\mu$ whose continuity without exception could be asserted, but other functions. Such expressions are obtained, e.g., when, in place of $t$, the arc length $s$ of $a$ or, more generally, a function of $s$ that is continuous together with its first $r$ derivatives is introduced as the independent variable, and when, accordingly, the arguments $x^{(\nu)}$, $y^{(\nu)}$ are replaced by $\frac{d^\nu x}{ds^\nu}$, $\frac{d^\nu y}{ds^\nu}$

in the P^, Qn , which, in turn, can be expressed in terms of the as well as 
the x^\ 2 / • We shall not, however, elaborate on this matter at this point. 

Up to now, the number $r$ of the first derivatives $x^{(\nu)}$, $y^{(\nu)}$ whose continuity was demanded in (34) for all curves A has remained indeterminate, and Theorems I, II, III have been derived for arbitrary $r$. Now, however, we shall show that

$$r \ge n - 1$$







angenommen werden muss, wenn in allgemeineren Fallen ein Minimum iiber- 
haupt moglich sein soil. 

Satz IV. Fiir jedes Integral J = J F dt und jedes Curvenstiick a Idsst 
sich immer eine solche Variation £,?y angeben, deren Ate Ableitungen 

(A ^ n — 1) an einer einzigen St.elle t = to endliche Spriinge erleiden 
diirfen, und welche der ersten Variation | einen von 0 verschiedenen Wert 
erteilt und damit ein Minimum unmoglich macht, vorausgesetzt, dass nicht 
in der ganzen Ausdehnung von a die Functionen P und Q unstetiq oder P\ 
und Q\ immer = 0 sind. 

Der Beweis kann auf den Fall beschrankt werden, wo die Forderungen 
der Satze II und III erfiillt sind, da sonst die Unmoglichkeit eines Minimums 
schon feststande. 

In einem Teil-Intervall t' . . .t " , in welchem P und Q stetig sind, nach II 
also liberall verschwinden, werde to so gewahlt, dass P\ g* 0 ist, und folgende 
Variation betrachtet: 



T] = 0 t”) 

£ = €i(t) = {t -t') n ipi{t) it' £ t < to) 

= Ut) = (t"-t) n Mt) (t 0 <t^t"), 

wo ipi (t) und ip 2 (t) ganze Functionen sind und so beschaffen, dass ausser 

£^ _1) (t') = £^ _1) (t") = 0 (57) 



noch 



{p= 1,2, ...n) , 



was immer angenommen werden darf, da nach dem fiir (50) angewandten Ver- 
fahren den 1 \to) und ^ ^(to) beliebige Werte vorgeschrieben werden 

konnen. 

Dann wircl: 



to — 0 t 

SJ = j SFdt+ j SF dt 

t’ t o + 0 

j to — 0 

tin- i) 



E 

M — 1 



— 1 



to 0 



wahrend die Integrale J Pt; dt wegen P = 0 verschwinden. Nach (57) und (58) 
mit Rucksicht auf die Stetigkeit der P^, die nach III auch fiir Unstetigkeits- 





must be assumed for a minimum to be possible at all in the more general cases.

Theorem IV. For any integral $J = \int F\,dt$ and any curve segment $a$, it is always possible to specify a variation $\xi, \eta$ whose $\lambda$th derivatives $(\lambda \le n - 1)$ may suffer finite jump discontinuities at a single point $t = t_0$, and which gives the first variation a value different from 0, thereby rendering a minimum impossible, provided that it is not the case that, on the entire extension of $a$, the functions $P$ and $Q$ are discontinuous or $P_\lambda$ and $Q_\lambda$ always $= 0$.

The proof may be restricted to the case in which the requirements of Theorems II and III are satisfied, since otherwise the impossibility of a minimum would already be established.

Given a partial interval $t' \dots t''$ on which $P$ and $Q$ are continuous, and hence, by II, vanish everywhere, let us choose $t_0$ so that $P_\lambda \ne 0$ and consider the following variation:

$$\eta = 0 \quad (t' \le t \le t'')$$

$$\xi = \xi_1(t) = (t - t')^n\,\varphi_1(t) \quad (t' \le t < t_0)$$

$$\phantom{\xi} = \xi_2(t) = (t'' - t)^n\,\varphi_2(t) \quad (t_0 < t \le t'')$$

where $\varphi_1(t)$ and $\varphi_2(t)$ are entire functions constituted so that, in addition to

$$\xi_1^{(\mu-1)}(t') = \xi_2^{(\mu-1)}(t'') = 0 , \tag{57}$$

also

$$\xi_1^{(\mu-1)}(t_0) - \xi_2^{(\mu-1)}(t_0) = \xi^{(\mu-1)}(t_0 - 0) - \xi^{(\mu-1)}(t_0 + 0) = e_{\mu,\lambda} \quad (\mu = 1, 2, \dots n) , \tag{58}$$

which may always be assumed, since, according to the procedure used for (50), it is possible to assign arbitrary values to the $\xi_1^{(\mu-1)}(t_0)$ and $\xi_2^{(\mu-1)}(t_0)$.

Then

$$\delta J = \int_{t'}^{t_0-0} \delta F\,dt + \int_{t_0+0}^{t''} \delta F\,dt = \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t'}^{t_0-0} + \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t_0+0}^{t''} ,$$

while the integrals $\int P\xi\,dt$ vanish on account of $P = 0$. But, by (57) and (58), and considering the continuity of the $P_\mu$, which, according to III, must be






stellen bestehen bleiben muss, wird dies aber 



46 



SJ = 



-i to — 0 



L ju = l 



J to + 0 



'll 

E P t^ o)(e ( ^ 1} (io - O)-^- 1 )^ + 0)) 



M = 1 



= p n(to)eti,\ = P\(to) ^ 0 , q.e.d. 

p = i 



Der Beweis liesse sich noch verscharfen durch Anwendung des fiir III 
benutzten umstandlicheren Verfahrens, was aber hier ohne Interesse ware. 

Auf Grund dieses Satzes IV wird im Folgenden von alien „erlaubten“ 
Curven die durchgangige Stetigkeit der Ableitungen x^\ bis zur p = 
r = n — 1 ten Ordnung iminer vorausgesetzt werden. Beriicksichtigt man 
aber auch hier wieder die Willkiirlichkeit der „Darstellung“, die von Teil zu 
Teil wechseln kann, so brauchen thatsachlich nur die Osculations-Invarianten 
n — 1 ter Ordnung iminer stetig zu sein; die einzelnen Teile miissen einander 
immer von mindestens n — 1 ter Ordnung beriihren. 

Die Untersuchung der „ersten Variation 11 , die zu den Satzen I bis IV fiihrte, 
wird in den meisten Fallen zur Auffindung der Curven genii gen, die unter 
den vorgeschriebenen Bedingungen fiir ein Minimum iiberhaupt in Betracht 
kommen. Nach Satz II miissen sie im allgemeinen der „Differentialgleichung 
des Problems “ G = 0 geniigen, die von der Ordnung 2 n ist und ein allgemeines 
Integral besitzt von der Form: 



X = ip(t; U!,U 2 , . . . U 2n ) = <p(t,u) 
y = UI, U2, ■ ■ ■ u 2n ) = 1p{t, u) , 



(59) 



wo die u v (y = 1,2,... 2 n) willkiirliche Constanten sind und in ihrer Gesamt- 
heit durch u angedeutet werden mogen, wahrend 



u) 

dt» 






dip(t, u) 
duv 



<Pv{t,u) 



u. s. w. gesetzt werde. 

47 | Wird F als analytische Function der y ^ vorausgesetzt, so sind 

auch G und claher auch ip und in Bezug auf ihre samtlichen Argumente 
analytische Functionen. 

Eine Curve a, die ein Minimum liefern soil, muss nach Satz II aus ei- 
ner Anzahl particularer Losungen der Differentialgleichung zusammengesetzt 
sein, also von der Form (59), wo die u v in den einzelnen Teilen constante Wer- 
te besitzen. An den Ubergangsstellen aber, an denen die u u endliche Spriinge 
erleiden konnen, miissen ausser den Bedingungen (34), wo nach IV r = n — 1 
zu setzen ist, noch gemass III die P ;J , Q M (/i = 1, 2, ... n) stetig bleiben. 







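preserved also for discontinuities, this becomes

$$\delta J = \Bigl[\sum_{\mu=1}^{n} P_\mu\,\xi^{(\mu-1)}\Bigr]_{t_0+0}^{t_0-0} = \sum_{\mu=1}^{n} P_\mu(t_0)\bigl(\xi^{(\mu-1)}(t_0 - 0) - \xi^{(\mu-1)}(t_0 + 0)\bigr)$$

$$= \sum_{\mu=1}^{n} P_\mu(t_0)\,e_{\mu,\lambda} = P_\lambda(t_0) \ne 0 , \quad \text{q.e.d.}$$

It is possible to further strengthen the proof by applying the more involved method used for III. But this is irrelevant to our present purposes.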
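
A hedged illustration of the construction, added editorially and not found in the original: for $n = 1$ and $\lambda = 1$ the two branches can be chosen by inspection. The particular choice of the two multipliers below is our own assumption; any pair producing a unit jump at $t_0$ would serve equally well.

```latex
% Editorial sketch for n = 1, \lambda = 1 (so that e_{1,1} = 1).
% Hypothetical choice: \varphi_1(t) = 1/(t_0 - t'), \varphi_2(t) = 0.
\[
\xi_1(t) = \frac{t - t'}{t_0 - t'} \quad (t' \le t < t_0),
\qquad
\xi_2(t) = 0 \quad (t_0 < t \le t'').
\]
% Condition (57): both branches vanish at their outer endpoints.
\[
\xi_1(t') = 0, \qquad \xi_2(t'') = 0.
\]
% Condition (58): unit jump of \xi itself at t_0.
\[
\xi(t_0 - 0) - \xi(t_0 + 0) = 1 - 0 = e_{1,1}.
\]
% Resulting first variation, using P = 0 on t' ... t'' and the continuity of P_1 at t_0:
\[
\delta J = P_1(t_0)\,\bigl(\xi(t_0 - 0) - \xi(t_0 + 0)\bigr) = P_1(t_0) \ne 0.
\]
```

In this lowest case the variation is discontinuous in $\xi$ itself, which is exactly what the continuity requirement on admissible curves rules out.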

Because of Theorem IV, we shall, in what follows, always assume that, for any “admissible” curve, the derivatives $x^{(\nu)}$, $y^{(\nu)}$ up to the order $\nu = r = n - 1$ are continuous throughout. In fact, taking into account again the arbitrariness of the “representation”, which may vary from one part to another, only the osculation invariants of $(n-1)$th order need to be always continuous; the individual parts must always have contact of at least $(n-1)$th order with one another.

For most cases, the investigation of the “first variation”, which led us to Theorems I-IV, suffices in order to find the curves that, given the prescribed conditions for a minimum, are at all relevant. Generally, by Theorem II, they must satisfy the “differential equation of the problem” $G = 0$, whose order is $2n$ and which possesses a general integral of the form

$$x = \varphi(t; u_1, u_2, \dots u_{2n}) = \varphi(t, u)$$

$$y = \psi(t; u_1, u_2, \dots u_{2n}) = \psi(t, u) , \tag{59}$$

where the $u_\nu$ $(\nu = 1, 2, \dots 2n)$ are arbitrary constants, which shall, summarily, be denoted by $u$, while we shall set

$$\frac{\partial^\mu \varphi(t, u)}{\partial t^\mu} = \varphi^{(\mu)}(t, u) , \qquad \frac{\partial \varphi(t, u)}{\partial u_\nu} = \varphi_\nu(t, u) ,$$

etc.

If $F$ is an analytic function of the $x^{(\mu)}$, $y^{(\mu)}$, then $G$, too, and hence also $\varphi$ and $\psi$, are analytic functions with respect to all their arguments.

A curve $a$ that is supposed to furnish a minimum must, by Theorem II, be composed of particular solutions of the differential equation, and hence must be of the form (59), where the $u_\nu$ have constant values in the individual parts. But at the corners, at which the $u_\nu$ may suffer finite jump discontinuities, the $P_\mu$, $Q_\mu$ $(\mu = 1, 2, \dots n)$ must also remain continuous according to III, in addition to the conditions (34), where, by IV, we have to set $r = n - 1$.







Diesen Bedingungen zusammen mit den „Grenzbedingungen“ (32) wird, 
wie aus der Anzahl der verfiigbaren Constanten hervorgeht, im allgemeinen 
nur auf eine endliche Anzahl von Weisen geniigt werden konnen, die nunmehr 
einzeln untersucht werden iniissen. 

Unser Problem hat sich also auf die Frage reduciert, ob ein vorgeschrie- 
benes, den bisherigen Bedingungen geniigendes Curvenstiick a ein wirkliches 
Minimum des Integrates liefert.. Dann miissten es notwendig auch seine ein- 
zelnen Teile, die ja auf Grund von (34) (r = n — 1) unter festgehaltenen 
Osculations-Invarianten n — 1 ter Ordnung an den beiden Endpunkten, al- 
so unter den Grenzbedingungen (32), einzeln variiert werden konnen (der 
Fall r > n — 1 soil jetzt nicht mehr beriicksichtigt werden), und erst wenn 
diese Eigenschaft von den einzelnen Teilen bewiesen ist, wird iiber die Zu- 
lassigkeit der Zusammensetzung entschieden werden konnen. Es kommt also 
alles an auf die Beantwortung der Frage: 

Unter welchen Bedingungen entspricht einem von singularen Punkten frei- 
en Stuck 1 2 einer particularen Losung (59) der Differentialgleichung des Pro- 
blems gemass der aufgestellten Definition ein wirkliches Minimum des zwi- 
schen seinen Grenzen erstreckten Integrates? 

Diese Aufgabe nach der Methode von Herrn Prof. Weierstrass zu losen, 
soil in den folgenden Abschnitten versucht werden. 



Drifter Abschnitt. 

Einfiihrung der Function E. 



Es sei die Curve a 



x = ip(t) = ip(t, a) = ip(t; ax,... a 2n ) 
y = ip(t) = ip(t, a) = ip(t ; ax,... a 2n ) 

nach (59) ein durch u u = a v (y = l,2,...2n) charakterisiertes besonderes 
Integral der Differentialgleichung des Problems und moge sich im ganzen 
Intervall 1 2 (ti ^ t ^ t 2 ) in der Weise regular verhalten, dass ip^ft), 
sowie auch ipif'* ( t ), ( t ) (p, v = 0, 1 , . . . 2 n) iiberall eindeutig, endlich und 

stetig sind, und dass ausserdem noch fur n > 1 bei constantem 7 bestandig 

ip ,2 {t)+ip ,2 (t) >7 2 >0. (35) 

Innerhalb dieses Intervalls werden zwei Punkte 0 und 3 angenommen, fiir 
welche t die Werte to und £3 besitzen moge, also 

tx ^ to < t.3 = t 2 

und, ebenso wie von Herrn Prof. Weierstrass fiir den Fall n = 1, folgende 
Variation 043 des Curvenstrickes 03 betrachtet: Durch x = ^(A), y = A) 







As is evident from the number of available constants, these conditions, together with the “limit conditions” (32), can in general only be satisfied in a finite number of ways, each of which must now be investigated individually.

Our problem has therefore been reduced to the question as to whether a prescribed curve segment $a$ satisfying the previous conditions furnishes a real minimum of the integral. If so, then its individual parts would have to do so as well, each of which can be varied individually because of (34) ($r = n - 1$) with fixed osculation invariants of $(n-1)$th order at the two endpoints, and hence under the limit conditions (32) (we shall no longer consider the case $r > n - 1$). Only once this property has been proved for the individual parts will it be possible to decide on the question as to whether the composition is admissible. Hence, it all turns on the question:

What are the conditions under which to a piece 1 2 of a particular solution (59) of the differential equation of the problem without any singular points there corresponds a real minimum of the integral taken between its limits in accordance with the present definition?

In the following sections, we shall try to solve this problem using the method of Prof. Weierstrass.



Third section.

Introduction of the function E.

Suppose that the curve $a$

$$x = \varphi(t) = \varphi(t, a) = \varphi(t; a_1, \dots a_{2n})$$

$$y = \psi(t) = \psi(t, a) = \psi(t; a_1, \dots a_{2n})$$

is a particular integral of the differential equation of the problem characterized by $u_\nu = a_\nu$ $(\nu = 1, 2, \dots 2n)$ according to (59) and that it is regular on the entire interval 1 2 $(t_1 \le t \le t_2)$ such that $\varphi^{(\mu)}(t)$, $\psi^{(\mu)}(t)$, as well as $\varphi_\nu^{(\mu)}(t)$, $\psi_\nu^{(\mu)}(t)$ $(\mu, \nu = 0, 1, \dots 2n)$, are everywhere single-valued, finite and continuous, and that, for $n > 1$, with constant $\gamma$ always

$$\varphi'^2(t) + \psi'^2(t) > \gamma^2 > 0 . \tag{35}$$

Consider two points 0 and 3 in the interior of this interval for which $t$ takes the values $t_0$ and $t_3$ respectively, and hence

$$t_1 \le t_0 < t_3 \le t_2 .$$

As Prof. Weierstrass did for the case $n = 1$, we now consider the following variation 0 4 3 of the curve segment 0 3: Suppose that $\bar{a}$ is a second curve given






werde eine zweite Curve a gegeben, welche die erste a im Punkte 3 von 
n— 1 ter Ordnung beruhren und sich in der Umgebung dieses Punktes ebenfalls 
bis zu den Ableitungen ?rter Ordnung stetig verhalten soil. Dann lasst sich 
nach (30) eine mit beliebig vielen Ableitungen stetige Function A = A (£), oder, 
was auf dasselbe hinauskommt, ein solches System von Werten A ^ = A ^ (£3) 
(n = 0, 1, ... n — 1; A' > 0) angeben, dass fur t = £3: 

I = M (t 3 ) = DM A) = M M = M 

yf ] = Vh^M = DM A) = MM = M 
(m = 0, l,...n— 1) , 

wenn 

M) = V>[A(t)] = Tp x (t) , ip(X) = ip[X(t)\ = (61) 

gesetzt wird, wo auch die Functionen Tp 1 , ip 1 sich in der Umgebung von t = £ 3 
regular verhalten. Diese Functionen tp 1 und damit die Curve a konnen 
immer so angenommen werden, dass nicht nur die Bedingungen (60) erfiillt 
werden, sondern auch die in Bezug auf das Argument t genommenen Ablei- 
tungen: 

M(t 3 ) = 4 n) > Vh^M = vt ] ( 60a ) 

beliebig vorgeschriebene, von <p ( ' n \ts) = x s n \ Mm = y g"'* verschiedene 
Werte annehmen. 




Nun werde in der Nahe von 3 auf der Curve a nach riickwarts ein Punkt 4 
angenommen mit den Coordinaten: 

x = X 4 = ^(£ 3 - £) = <p [A(£ 3 - e)] = - t) 

V = V4 = tpi{t 3 - e) = ip [A(£ 3 - e)] = ip( A - l) , 
wo 

l = A (£3) — A(£ 3 — e) = X' (ts)£ + (5)2 = X' e + (e) 2 , 
und mit 0 durch eine solche Curve a E : 

x = <p e (t) = ip(t) + £Tp e (t) + (e) 2 , 

y = M) = i>(t) + £ip e {t) + (e) 2 







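by $x = \bar\varphi(\lambda)$, $y = \bar\psi(\lambda)$ that makes contact of the $(n-1)$th order with the first one $a$ at point 3 and is also continuous in the vicinity of that point up to the derivatives of $n$th order. Then, by (30), it is possible to specify a function $\lambda = \lambda(t)$ that, together with arbitrarily many derivatives, is continuous, or, what amounts to the same thing, a system of values $\lambda^{(\mu)} = \lambda^{(\mu)}(t_3)$ $(\mu = 0, 1, \dots n - 1;\ \lambda' > 0)$ such that for $t = t_3$

$$\bar{x}_3^{(\mu)} = \bar\varphi_1^{(\mu)}(t_3) = D^\mu \bar\varphi(\lambda) = \varphi^{(\mu)}(t_3) = x_3^{(\mu)}$$

$$\bar{y}_3^{(\mu)} = \bar\psi_1^{(\mu)}(t_3) = D^\mu \bar\psi(\lambda) = \psi^{(\mu)}(t_3) = y_3^{(\mu)} \tag{60}$$

$$(\mu = 0, 1, \dots n - 1) ,$$

if we set

$$\bar\varphi(\lambda) = \bar\varphi[\lambda(t)] = \bar\varphi_1(t) , \qquad \bar\psi(\lambda) = \bar\psi[\lambda(t)] = \bar\psi_1(t) \tag{61}$$

where the functions $\bar\varphi_1$, $\bar\psi_1$, too, are regular in the vicinity of $t = t_3$. It is always possible to take these functions $\bar\varphi_1$, $\bar\psi_1$, and hence the curve $\bar{a}$, so that not only the conditions (60) are satisfied, but also so that the derivatives with respect to the argument $t$,

$$\bar\varphi_1^{(n)}(t_3) = \bar{x}_3^{(n)} , \qquad \bar\psi_1^{(n)}(t_3) = \bar{y}_3^{(n)} \tag{60a}$$

assume arbitrarily prescribed values different from $\varphi^{(n)}(t_3) = x_3^{(n)}$, $\psi^{(n)}(t_3) = y_3^{(n)}$.

Let 4 be a point near 3, taken backward on the curve $\bar{a}$, with the coordinates

$$x = x_4 = \bar\varphi_1(t_3 - \varepsilon) = \bar\varphi[\lambda(t_3 - \varepsilon)] = \bar\varphi(\lambda - \iota)$$

$$y = y_4 = \bar\psi_1(t_3 - \varepsilon) = \bar\psi[\lambda(t_3 - \varepsilon)] = \bar\psi(\lambda - \iota) ,$$

where

$$\iota = \lambda(t_3) - \lambda(t_3 - \varepsilon) = \lambda'(t_3)\,\varepsilon + (\varepsilon)_2 = \lambda'\varepsilon + (\varepsilon)_2 ,$$

that is connected with 0 by a curve $a_\varepsilon$ given by

$$x = \varphi_\varepsilon(t) = \varphi(t) + \varepsilon\,\bar\varphi_\varepsilon(t) + (\varepsilon)_2 ,$$

$$y = \psi_\varepsilon(t) = \psi(t) + \varepsilon\,\bar\psi_\varepsilon(t) + (\varepsilon)_2$$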







verbunden, welche a in 0 und a in 4 von n— 1 ter Ordnung beriihrt, wahrend ip e 
und ip e im ganzen Intervall 0 4 denselben Stetigkeitsbedingungen wie ip und ip 
50 geniigen und sich fiir unendlich | kleine e auf diese Functionen selbst redu- 
cieren. Dann rniissen nach (30) fiir passende Functionen i?o (t) = t + To(t), 
$ 3 (f) = t + r 3 (t) und fiir die Stellen t = to, t = t 3 die Gleichungen bestehen: 

D»<p e {tio) = V M (t 0 ) DVipsi# 3 ) = -e) 

D*iP e (& o) = ip W (t 0 ) D»ip e (# 3 ) = (*3 - e) 

(M = 0, l,...n-l) . 

Aus den ersten dieser Gleichungen (fiir p = 0) ergeben sich bei hinreichend 
kleinem e die Grossen i9q und id 3 eindeutig und beliebig wenig verschieden 
von to und 1 3 : 



$0 = t Q + T 0 = t 0 + £T 0 + (e) 2 , r &3=t 3 +T3=t 3 + £T 3 + (e) 2 , 



wenn nicht etwa die Curve a in 0 oder 3 sich selbst durchschneidet , ein Fall, 
der von der Betrachtung ausgeschlossen werden moge. Wenn man namlich 
zur Grenze e = 0 iibergeht, so wird nach (60) und (63) 

lim<p e (i? 0 ) = lim^(i? 0 ) = <p(to) 
lim^ e (i? 0 ) = lim^(i?o) = ip(t 0 ) 

limc£ e (i? 3 ) = limy>(i? 3 ) = Tp x (t 3) = <p(t 3 ) 

limi/’ e (^ 3 ) = lim ip(id 3 ) = ip 1 (t 3 ) = ip(t 3 ) , 



und daher (nach der eben gemachten Voraussetzung): lim i9o = to, lim id 3 = f 3 . 
Da nun auch: lim ■*(■$()) = linnp^^o) = <P^(^o) , 

lim ip^ (id 0 ) = ip ^ (to) , lim (id 3 ) = ip ^ (t 3 ) u. s. w. 

und mindestens eine der Grossen ip' (to), ip' (to) und eine der ip' (t 3 ), ip' (t 3), fiir 
kleine e also auch eine der <p' e (id 0), V’e('^o) und eine der p' £ (ido), ipe($ 0 ) nicht 
verschwindet, so liefern die iibrigen Gleichungen (63) (p = 1,2, ...n — 1) 
mit Hilfe von (2) durch successive, lineare Auflosung fiir die Unbekannten 

(*> 0 ° = # ( o M) (t 0 ), 4 M) = 4 M) (t 3 )) eindeutig 
bestimmte Werte von der Beschaffenheit, dass auch die absoluten Betrage 
der Grossen: 



51 



_(M) _ .qM - _ Au)( + \ 

T o ~ Vo e /u 1 — r o (hi) > 



T (r) = _ e 



1 



= rjr\t a ) 



(p = 1,2 ,...n- 1) 



mit £ gleichzeitig unendlich klein werden, also schliesslich: 

r o (Al) = + (e) 2 , = £T^ } + (e) 2 

(p = 0 ,l,...n-l). 







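which makes contact of the $(n-1)$th order with $a$ at 0 and with $\bar{a}$ at 4, while $\varphi_\varepsilon$ and $\psi_\varepsilon$ satisfy the same continuity conditions on the entire interval 0 4 as the functions $\varphi$ and $\psi$, to which they reduce for infinitely small $\varepsilon$. Then, by (30), the following equations must hold for suitably chosen functions $\vartheta_0(t) = t + \tau_0(t)$, $\vartheta_3(t) = t + \tau_3(t)$ and for the positions $t = t_0$, $t = t_3$:

$$D^\mu \varphi_\varepsilon(\vartheta_0) = \varphi^{(\mu)}(t_0) , \qquad D^\mu \varphi_\varepsilon(\vartheta_3) = \bar\varphi_1^{(\mu)}(t_3 - \varepsilon)$$

$$D^\mu \psi_\varepsilon(\vartheta_0) = \psi^{(\mu)}(t_0) , \qquad D^\mu \psi_\varepsilon(\vartheta_3) = \bar\psi_1^{(\mu)}(t_3 - \varepsilon) \tag{63}$$

$$(\mu = 0, 1, \dots n - 1) .$$

Given a sufficiently small $\varepsilon$, the first of these equations (for $\mu = 0$) yield the quantities $\vartheta_0$ and $\vartheta_3$ uniquely and differing arbitrarily little from $t_0$ and $t_3$ respectively:

$$\vartheta_0 = t_0 + \tau_0 = t_0 + \varepsilon\bar\tau_0 + (\varepsilon)_2 , \qquad \vartheta_3 = t_3 + \tau_3 = t_3 + \varepsilon\bar\tau_3 + (\varepsilon)_2 ,$$

unless, say, the curve $a$ intersects itself at 0 or 3, a case which shall be excluded from consideration. For if we pass to the limit $\varepsilon = 0$, we then have, by (60) and (63),

$$\lim \varphi_\varepsilon(\vartheta_0) = \lim \varphi(\vartheta_0) = \varphi(t_0)$$

$$\lim \psi_\varepsilon(\vartheta_0) = \lim \psi(\vartheta_0) = \psi(t_0)$$

$$\lim \varphi_\varepsilon(\vartheta_3) = \lim \varphi(\vartheta_3) = \bar\varphi_1(t_3) = \varphi(t_3)$$

$$\lim \psi_\varepsilon(\vartheta_3) = \lim \psi(\vartheta_3) = \bar\psi_1(t_3) = \psi(t_3) ,$$

and we therefore have (by the assumption just made) $\lim \vartheta_0 = t_0$, $\lim \vartheta_3 = t_3$. Since now also

$$\lim \varphi_\varepsilon^{(\mu)}(\vartheta_0) = \lim \varphi^{(\mu)}(\vartheta_0) = \varphi^{(\mu)}(t_0) , \quad \lim \psi_\varepsilon^{(\mu)}(\vartheta_0) = \psi^{(\mu)}(t_0) , \quad \lim \varphi_\varepsilon^{(\mu)}(\vartheta_3) = \varphi^{(\mu)}(t_3) \quad \text{etc.}$$

and at least one of the quantities $\varphi'(t_0)$, $\psi'(t_0)$ and one of the $\varphi'(t_3)$, $\psi'(t_3)$, and hence, for small $\varepsilon$, also one of the $\varphi'_\varepsilon(\vartheta_0)$, $\psi'_\varepsilon(\vartheta_0)$ and one of the $\varphi'_\varepsilon(\vartheta_3)$, $\psi'_\varepsilon(\vartheta_3)$ do not vanish, the successive linear solution of the remaining equations (63) $(\mu = 1, 2, \dots n - 1)$ in the unknowns $\vartheta'_0, \vartheta''_0, \dots \vartheta_0^{(n-1)}$, $\vartheta'_3, \dots \vartheta_3^{(n-1)}$ $\bigl(\vartheta_0^{(\mu)} = \vartheta_0^{(\mu)}(t_0),\ \vartheta_3^{(\mu)} = \vartheta_3^{(\mu)}(t_3)\bigr)$ yields, by use of (2), uniquely determined values such that also the absolute values of the quantities

$$\tau_0^{(\mu)} = \vartheta_0^{(\mu)} - e_{\mu,1} = \tau_0^{(\mu)}(t_0) , \qquad \tau_3^{(\mu)} = \vartheta_3^{(\mu)} - e_{\mu,1} = \tau_3^{(\mu)}(t_3) \quad (\mu = 1, 2, \dots n - 1)$$

become infinitely small together with $\varepsilon$, and hence finally:

$$\tau_0^{(\mu)} = \varepsilon\bar\tau_0^{(\mu)} + (\varepsilon)_2 , \qquad \tau_3^{(\mu)} = \varepsilon\bar\tau_3^{(\mu)} + (\varepsilon)_2 \quad (\mu = 0, 1, \dots n - 1) .$$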







Nun kann man eine einzige Function: 

r = r(t) = r(t; e) = ef(t) + (s) 2 

bestimmen, welche im Intervall to = t = t 3 mit beliebig vielen Ableitungen 
stetig ist, fiir e = 0 sich auf 0 reduciert und den Bedingungen geniigt: 

=t^ , >)(t3) = r^ } (64) 

(M = 0, 1, ...n— 1) , 



z.B. 

n — 1 

T = { r 0 l)fc i( i ; i 0,t3) +Tg l) A: i (t;t3,t 0 )| , 

2 = 0 

wo die ganze Function 

h{t) = ki(t-,t 0 ,t 3 ) = ( t-t 3 ) n Xi(t ) 



nur den Bedingungen zu geniigen braucht: 



*i M) (to) = e M ,» , fc- M) (t 3 ) = 0 

(M = 0, l,...n-l) . 

Dies aber kann nach den zu (50) gemachten Bemerkungen durch entspre- 
chende Bestimmung der X^(t o) immer erreicht werden, namlich durch 



(t - t 0 ) i 






t = to 



xFftO = ^ 

da dann in der That: 

^(* 0 ) = D* l(t-t 3 r X i(t)] t0 =D» 

| wird, wir erhalten also: 



(/x = 0, 1, . . . n — 1) , 



(t — to) 4 ' 



“-/2, 2 



*0 



ki(t ; t 0 , t 3 ) = (t - t 3 ) n V — — -CF 1 
(i = 0, 1, . . .n — 1) , 

ganze Functionen von t und unablrangig von e, sodass der aus ihnen in der 
angegebenen Weise gebildete Ausdruck fiir r in der That alien gestellten 
Forderungen geniigt. 



(t - t 0 y 
*!(* — 1 3 )" 




Investigations in the calculus of variations 



107 



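It is now possible to determine a single function

$$\tau = \tau(t) = \tau(t;\varepsilon) = \varepsilon\,\bar\tau(t) + (\varepsilon)_2$$

that, together with arbitrarily many derivatives, is continuous on the interval $t_0 \le t \le t_3$, that reduces to 0 when $\varepsilon = 0$, and that satisfies the conditions

$$\tau^{(\mu)}(t_0) = \tau_0^{(\mu)}, \qquad \tau^{(\mu)}(t_3) = \tau_3^{(\mu)} \qquad (\mu = 0, 1, \dots n-1), \tag{64}$$

e.g.,

$$\tau = \sum_{i=0}^{n-1} \left\{ \tau_0^{(i)}\, k_i(t; t_0, t_3) + \tau_3^{(i)}\, k_i(t; t_3, t_0) \right\},$$

where the entire function

$$k_i(t) = k_i(t; t_0, t_3) = (t - t_3)^n\, \chi_i(t)$$

only needs to satisfy the conditions

$$k_i^{(\mu)}(t_0) = e_{\mu,i}, \qquad k_i^{(\mu)}(t_3) = 0 \qquad (\mu = 0, 1, \dots n-1).$$

But, according to the remarks made in connection with (50), this can always be attained by means of an appropriate determination of the $\chi_i^{(\mu)}(t_0)$, namely by means of

$$\chi_i^{(\mu)}(t_0) = D^\mu \left[ \frac{(t - t_0)^i}{i!\,(t - t_3)^n} \right]_{t = t_0} \qquad (\mu = 0, 1, \dots n-1),$$

for then, in fact,

$$k_i^{(\mu)}(t_0) = D^\mu \left[ (t - t_3)^n \chi_i(t) \right]_{t_0} = D^\mu \left[ \frac{(t - t_0)^i}{i!} \right]_{t_0} = e_{\mu,i},$$

and hence we obtain

$$k_i(t; t_0, t_3) = (t - t_3)^n \sum_{\mu=0}^{n-1} \frac{(t - t_0)^\mu}{\mu!}\, D^\mu \left[ \frac{(t - t_0)^i}{i!\,(t - t_3)^n} \right]_{t_0} \qquad (i = 0, 1, \dots n-1),$$

entire functions of $t$ independent of $\varepsilon$, so that the expression for $\tau$ formed from them in the manner specified really meets all the requirements posed here.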
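The construction of the functions k_i can be checked with exact rational arithmetic. The following is only a minimal sketch under stated assumptions: the helper names (`k_poly`, `derivs_at`, and so on) are hypothetical, the polynomials are stored as coefficient lists in powers of s = t - t0, and the Taylor coefficients of 1/(t - t3)^n about t0 are taken from the binomial series. It is not part of the original text.

```python
from fractions import Fraction
from math import comb, factorial


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_deriv(a):
    """Differentiate a polynomial once."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def poly_eval(a, s):
    """Evaluate a polynomial at s."""
    return sum(c * s**k for k, c in enumerate(a))


def k_poly(i, n, t0, t3):
    """Coefficients of k_i(t; t0, t3) in powers of s = t - t0 (requires t0 != t3)."""
    d = Fraction(t0 - t3)  # so that t - t3 = s + d
    # chi_i: Taylor polynomial of degree n-1 (about s = 0) of s**i / (i! * (s + d)**n),
    # using (s + d)**(-n) = sum_j (-1)**j * C(n+j-1, j) * s**j / d**(n+j).
    chi = [Fraction(0)] * n
    for mu in range(i, n):
        j = mu - i
        chi[mu] = Fraction((-1) ** j * comb(n + j - 1, j), factorial(i)) / d ** (n + j)
    factor = [Fraction(1)]
    for _ in range(n):
        factor = poly_mul(factor, [d, Fraction(1)])  # builds (s + d)**n
    return poly_mul(factor, chi)


def derivs_at(a, s, count):
    """Values of the 0th, 1st, ..., (count-1)th derivative of polynomial a at s."""
    vals = []
    for _ in range(count):
        vals.append(poly_eval(a, s))
        a = poly_deriv(a)
    return vals


if __name__ == "__main__":
    n, t0, t3 = 3, 0, 2
    for i in range(n):
        k = k_poly(i, n, t0, t3)
        # Derivatives at t0 (s = 0) and at t3 (s = t3 - t0), orders 0 .. n-1.
        print(i, derivs_at(k, Fraction(0), n), derivs_at(k, Fraction(t3 - t0), n))
```

For each i, the first printed list should be the i-th unit vector (the values e_{mu,i}) and the second list should consist of zeros, which is exactly what the conditions on k_i require.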







Mit Hilfe von (64) lassen sich jetzt die Gleichungen (63) in der Form 
schreiben: 



D»cp £ (t + T) = ipM(to) 
D^ s {t + t) = ^(t 0 ) 
D^y e {t + t) = Tp^ ] (t 3 - e) 
D^tp e (t + t) = ip[^(t 3 - e) 



{t = t 0 ) 
(t = h) 



53 



t~ 1 \t 0 ) = 0 






(t 3 ) = -4 M) = - 






i, n ^ 



V ilx 1] {t 3 ) = - y ^ - e^n ( 

(/r = 1,2, ...n) . 



~(™) _ r («) 
•*■3 x 3 

— (n) ( n ) 

y 3 -y 3 



(65) 



(M = 0, 1, • • .n — 1) , 

und wenn man folgende gestattete Darstellung der Curve a e einfiihrt: 

x = <p £ (t + t) = <f(t) + £(i; e) = ip(t) + e£(t) + (e ) 2 
V = ipe(t + t) = ip(t) + T]{t; e) = ip(t) + £T)(t) + (e ) 2 
(to^t^ t 3 ) , 

mit Benutzung von (60) auch folgendermaassen: 

£ (/i) (*o;£) = 0 
y ( ^{t 0 -,£) = 0 

& ] (t 3 ;£) = Tpf\t 3 - e) - ip M {t 3 ) = w[^{t 3 -e)- tpf\t 3 ) 

= -£<Pi 1 + 1) (t 3 ) + (e) 2 

V M (h;e) = -e)~ ip itl \t 3 ) = ip { 3 \t 3 - e) - ip[^(t 3 ) 

= -£^ + 1 \t 3 ) + (e) 2 

(H = 0,l,...n- 1) , 

| schliesslich aber, durch Vergleichung der Coefficienten von e: 

rj^~ 1 \t 0 ) = 0, 



(66) 



(67) 



(68) 



Den Gleichungen (67) kann durch passende Functionen £(i;e) und ??(t;e) 
immer geniigt werden, nach den Betrachtungen von (64) z. B. durch: 



n — 1 



£{t,e) = Y & ) ( t 3,^)ki J ,(t;t 3 ,to) 



fi — 0 

n — 1 



T){t,£) = Y r n {lJ ' ) (t 3 ,£)k^(t;t 3 ,t 0 ) 

n = 0 







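By means of (64), we may now write the equations (63) in the form

$$\begin{aligned}
D^\mu \varphi_\varepsilon(t + \tau) &= \varphi^{(\mu)}(t_0), & D^\mu \psi_\varepsilon(t + \tau) &= \psi^{(\mu)}(t_0) & (t = t_0) \\
D^\mu \varphi_\varepsilon(t + \tau) &= \bar\varphi_1^{(\mu)}(t_3 - \varepsilon), & D^\mu \psi_\varepsilon(t + \tau) &= \bar\psi_1^{(\mu)}(t_3 - \varepsilon) & (t = t_3)
\end{aligned} \qquad (\mu = 0, 1, \dots n-1) \tag{65}$$

and, by introducing the following admissible representation of the curve $a_\varepsilon$:

$$\begin{aligned}
x &= \varphi_\varepsilon(t + \tau) = \varphi(t) + \xi(t;\varepsilon) = \varphi(t) + \varepsilon\,\bar\xi(t) + (\varepsilon)_2 \\
y &= \psi_\varepsilon(t + \tau) = \psi(t) + \eta(t;\varepsilon) = \psi(t) + \varepsilon\,\bar\eta(t) + (\varepsilon)_2
\end{aligned} \qquad (t_0 \le t \le t_3), \tag{66}$$

by use of (60), also as follows:

$$\begin{aligned}
\xi^{(\mu)}(t_0;\varepsilon) &= 0, \qquad \eta^{(\mu)}(t_0;\varepsilon) = 0 \\
\xi^{(\mu)}(t_3;\varepsilon) &= \bar\varphi_1^{(\mu)}(t_3 - \varepsilon) - \varphi^{(\mu)}(t_3) = \bar\varphi_1^{(\mu)}(t_3 - \varepsilon) - \bar\varphi_1^{(\mu)}(t_3) = -\varepsilon\,\bar\varphi_1^{(\mu+1)}(t_3) + (\varepsilon)_2 \\
\eta^{(\mu)}(t_3;\varepsilon) &= \bar\psi_1^{(\mu)}(t_3 - \varepsilon) - \psi^{(\mu)}(t_3) = \bar\psi_1^{(\mu)}(t_3 - \varepsilon) - \bar\psi_1^{(\mu)}(t_3) = -\varepsilon\,\bar\psi_1^{(\mu+1)}(t_3) + (\varepsilon)_2
\end{aligned} \qquad (\mu = 0, 1, \dots n-1), \tag{67}$$

and finally, by comparison of the coefficients of $\varepsilon$:

$$\begin{aligned}
\bar\xi^{(\mu-1)}(t_0) &= 0, \qquad \bar\eta^{(\mu-1)}(t_0) = 0, \\
\bar\xi^{(\mu-1)}(t_3) &= -\bar x_3^{(\mu)} = -x_3^{(\mu)} - e_{\mu,n}\left(\bar x_3^{(n)} - x_3^{(n)}\right), \\
\bar\eta^{(\mu-1)}(t_3) &= -\bar y_3^{(\mu)} = -y_3^{(\mu)} - e_{\mu,n}\left(\bar y_3^{(n)} - y_3^{(n)}\right)
\end{aligned} \qquad (\mu = 1, 2, \dots n). \tag{68}$$

It is always possible to satisfy the equations (67) by means of suitable functions $\xi(t;\varepsilon)$ and $\eta(t;\varepsilon)$; for example, according to the considerations of (64), by

$$\xi(t,\varepsilon) = \sum_{\mu=0}^{n-1} \xi^{(\mu)}(t_3,\varepsilon)\, k_\mu(t; t_3, t_0), \qquad
\eta(t,\varepsilon) = \sum_{\mu=0}^{n-1} \eta^{(\mu)}(t_3,\varepsilon)\, k_\mu(t; t_3, t_0),$$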







und dann liefert der Ansatz: 



x = = <p e (t.) , 

V = ip{t) +r){t,e) = ip e (t) 

gemass (66) (fur r = 0) jedenfalls eine Curve a e von der verlangten Be- 
schaffenheit, womit die Ausfiihrbarkeit der vorausgesetzten Constructionen 
erwiesen ist. 

Jetzt werden die fiber 0 3 und 0 4 erstreckten Integrate : 



Jo 3= f F dt 



to 



und nach (66): 

t3 



J 0 4 = J F (V M) (f) + e£ (Al) (i) + (e ) 2 , 4E t ) + £?? (/i) (*) + (^ 2 ) dt (69) 



*0 



— J03 + sSJc 13 + (5)2 ■ 
Hier ist nach (41): 



54 



dt 



I SJ m = (X^ ] + Y^) 

to V = 0 

(xM = (t), ( t ), ^ u. s. w.) 

oder, da unter den gemachten Voraussetzungen die Umformung (42) immer 
zulassig ist: 



5Jq3 = 



I *3 *3 



E ( P mE x) + Q ^ - 1] ) + f(PZ + On) 

^ =1 E to 



dt . 



Da a der Differentialgleichung des Problems geniigen soil, so verschwindet 
wegen P = 0 und Q = 0 das zweite Glied, und es wird nun wegen (68): 



dJo3 = — 



n 

E (-Pm03)4 M) +Qn(t3)V3' ) 



M = 1 



n 

E + q^ ( a1) 



-Pr 



M= 1 



-I t = t 3 



,(t 3 ) (4 n) -4 n) ) 



Qn(*3) (: 



— (n) (n) 

2/3 — 2/3 







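and the ansatz

$$x = \varphi(t) + \xi(t,\varepsilon) = \varphi_\varepsilon(t), \qquad y = \psi(t) + \eta(t,\varepsilon) = \psi_\varepsilon(t),$$

in accordance with (66) (when $\tau = 0$), then certainly furnishes a curve $a_\varepsilon$ constituted as required, whereby we have shown that it is possible to carry out the assumed constructions.

The integrals taken along 0 3 and 0 4 now become

$$J_{03} = \int_{t_0}^{t_3} F\left(\varphi^{(\mu)}(t), \psi^{(\mu)}(t)\right) dt$$

and, by (66),

$$\begin{aligned}
J_{04} &= \int_{t_0}^{t_3} F\left(\varphi^{(\mu)}(t) + \varepsilon\,\bar\xi^{(\mu)}(t) + (\varepsilon)_2,\ \psi^{(\mu)}(t) + \varepsilon\,\bar\eta^{(\mu)}(t) + (\varepsilon)_2\right) dt \\
&= J_{03} + \varepsilon\,\delta J_{03} + (\varepsilon)_2 .
\end{aligned} \tag{69}$$

Here, by (41),

$$\delta J_{03} = \int_{t_0}^{t_3} \sum_{\mu=0}^{n} \left( X_\mu\,\bar\xi^{(\mu)} + Y_\mu\,\bar\eta^{(\mu)} \right) dt \qquad \left( x^{(\mu)} = \varphi^{(\mu)}(t),\ X_\mu = \frac{\partial F}{\partial x^{(\mu)}},\ \text{etc.} \right)$$

or, since, by our assumptions, the transformation (42) is always admissible,

$$\delta J_{03} = \left[ \sum_{\mu=1}^{n} \left( P_\mu\,\bar\xi^{(\mu-1)} + Q_\mu\,\bar\eta^{(\mu-1)} \right) \right]_{t_0}^{t_3} + \int_{t_0}^{t_3} \left( P\,\bar\xi + Q\,\bar\eta \right) dt .$$

Since $a$ is supposed to satisfy the differential equation of the problem, the second term vanishes on account of $P = 0$ and $Q = 0$, and now, on account of (68),

$$\begin{aligned}
\delta J_{03} &= -\sum_{\mu=1}^{n} \left( P_\mu(t_3)\,\bar x_3^{(\mu)} + Q_\mu(t_3)\,\bar y_3^{(\mu)} \right) \\
&= -\sum_{\mu=1}^{n} \left[ P_\mu\,x^{(\mu)} + Q_\mu\,y^{(\mu)} \right]_{t = t_3} - P_n(t_3)\left( \bar x_3^{(n)} - x_3^{(n)} \right) - Q_n(t_3)\left( \bar y_3^{(n)} - y_3^{(n)} \right)
\end{aligned}$$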







oder, mit Benutzung von (18) i und (8): 

SJos = -F(h) - X n (t 3 ) (4 n) - 4 n) ) - Yn{ts) ( vi n) - yi n) ) ■ (70) 

Das liber 4 3 erstreckte Integral aber wird: 

*3 

J 43 = J (71) 

*3 — e 

gemass der Form (61) fiir a, oder 

J 43 = eF ^ M) (t 3 ), + (e)2 

= eF +( £ )2 , 

55 | so dass schliesslich wegen (69), (70) und (71): 



<7o4 — >7o3 + — ^043 — >7o3 (72) 

= eE + ( £ )2 . 

oder nach (62) 

= ^e (4" ) )y3' i) ;*3 n) > j/ 3 ^) + (O 2 , 

wenn gesetzt wird: 

E(x^\y^ ) -,x (n \y {n) ) =F-F-X n (x^-x^)-Y n (y^~y^) (73) 

= F(^>,l( W )-F(xW,i/ w ) - ®W) 

-Y n (x^,y^)(y^ -y^) , 

wo nach (62): 

=X M , yM=yM (/i = 0, 1, . . .71 — 1) , 



also: 

F = F(x^\y^ 

= F^x,x',...x^- 1 \x^; y,y',...y^- l \y^) . 

Lasst man hier die Argumente a;^, (y = 0, 1, . . . n — 1) iiberall fort 
und ersetzt: 







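or, by use of $(18)_1$ and (8),

$$\delta J_{03} = -F(t_3) - X_n(t_3)\left( \bar x_3^{(n)} - x_3^{(n)} \right) - Y_n(t_3)\left( \bar y_3^{(n)} - y_3^{(n)} \right). \tag{70}$$

The integral taken along 4 3, however, becomes

$$J_{43} = \int_{t_3 - \varepsilon}^{t_3} F\left( \bar\varphi_1^{(\mu)}(t), \bar\psi_1^{(\mu)}(t) \right) dt \tag{71}$$

in accordance with the form (61) for $\bar a$, or

$$J_{43} = \varepsilon\,F\left( \bar\varphi_1^{(\mu)}(t_3), \bar\psi_1^{(\mu)}(t_3) \right) + (\varepsilon)_2 = \varepsilon\,F\left( \bar x_3^{(\mu)}, \bar y_3^{(\mu)} \right) + (\varepsilon)_2,$$

so that, finally, on account of (69), (70) and (71),

$$J_{04} - J_{03} + J_{43} = J_{043} - J_{03} = \varepsilon\,E\left( x_3^{(\mu)}, y_3^{(\mu)}; \bar x_3^{(\mu)}, \bar y_3^{(\mu)} \right) + (\varepsilon)_2, \tag{72}$$

or, by (62),

$$= \varepsilon\,E\left( x_3^{(\mu)}, y_3^{(\mu)}; \bar x_3^{(n)}, \bar y_3^{(n)} \right) + (\varepsilon)_2,$$

if we set

$$\begin{aligned}
E\left( x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)} \right) &= \bar F - F - X_n\left( \bar x^{(n)} - x^{(n)} \right) - Y_n\left( \bar y^{(n)} - y^{(n)} \right) \\
&= F\left( \bar x^{(\mu)}, \bar y^{(\mu)} \right) - F\left( x^{(\mu)}, y^{(\mu)} \right) - X_n\left( x^{(\mu)}, y^{(\mu)} \right)\left( \bar x^{(n)} - x^{(n)} \right) \\
&\quad - Y_n\left( x^{(\mu)}, y^{(\mu)} \right)\left( \bar y^{(n)} - y^{(n)} \right),
\end{aligned} \tag{73}$$

where, by (62),

$$\bar x^{(\mu)} = x^{(\mu)}, \qquad \bar y^{(\mu)} = y^{(\mu)} \qquad (\mu = 0, 1, \dots n-1),$$

and hence

$$\bar F = F\left( \bar x^{(\mu)}, \bar y^{(\mu)} \right) = F\left( x, x', \dots x^{(n-1)}, \bar x^{(n)};\ y, y', \dots y^{(n-1)}, \bar y^{(n)} \right).$$

If we now drop the arguments $x^{(\nu)}$, $y^{(\nu)}$ $(\nu = 0, 1, \dots n-1)$ everywhere and replace $x^{(n)}$, $y^{(n)}$; $\bar x^{(n)}$, $\bar y^{(n)}$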







der Reihe nach durch 



P, q\ P, Q, 



so kann man abgekiirzt schreiben: 

E(p,q;p,q)= (73a) 

F(p,q) - F{p,q) - dF<y g p q ^ (p - p) - dF<y g q q - (<? - q) ■ 

Die Gleichung (72) entspricht genau der von Herrn Prof. Weierstrass fur 
den Fall n = 1 abgeleiteten, ebenso auch die Formel (73) oder (73a) einem 
der von ihm fiir die Function E aufgestellten Ausdrucke. 

56 | Nach (60a) ist hier mit Hilfe von (2): 

z (n) = <p[ n \t 3 ) = D n Tp{\) = R n (rp^\ A), A<**>) 
y {n) = tfV) = D n ^( A) = R n (^ (m) (A), \M) , 



also abhangig von den Grossen A, A', . . . X^ n \ von denen durch (60) nur die 
ersten A, A', . . . A^™ ~ bestimmt werden, wahrend A^ noch ganzlich willkiir- 
lich bleibt. Da aber E seiner geometrischen Bedeutung wegen einen bestimm- 
ten Wert besitzen muss, so wird es nur scheinbar von A^ abhangen konnen 
und seine nach A^”^ genommene Ableitung identisch verschwinden miissen. 

In der That wird nach (2) mit Hilfe von (60) 



also nach (73): 



dx (n) 

0A(") 

dyW 

dXM 



¥( A) 



X' 

A' 



A' ’ 

t 

X' ’ 



dE dF dx dF dy < n ) dF 

dXW ~ dX ( n ) 9A(") dX ( n ) dy^ 



weil nach (17)„: 



(74) 



dF dx (n) dF dy {n) _ 1 ( dF , dF A _ 

dx ( n ) <9A( n ) dy( n ) <9A( n ) A' \3aA n ) + dy( n ^ J 



und: 



OF dF dx {n) OF dy (n) 1 / dF , dF \ 

~Q}Jn) - fain) Q\(n) + Qy(n) Q X (n) ~ + Qyin) V J ~ 







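successively by

$$p,\ q;\ \bar p,\ \bar q,$$

we then may use the abbreviation

$$E(p, q; \bar p, \bar q) = F(\bar p, \bar q) - F(p, q) - \frac{\partial F(p, q)}{\partial p}\,(\bar p - p) - \frac{\partial F(p, q)}{\partial q}\,(\bar q - q). \tag{73a}$$

The equation (72) precisely corresponds to the one derived by Prof. Weierstrass for the case $n = 1$, as does the formula (73) or (73a) to an expression used by him for the function $E$.

By (60a), we now have by means of (2)

$$\begin{aligned}
\bar x^{(n)} &= \bar\varphi_1^{(n)}(t_3) = D^n \bar\varphi(\lambda) = R_n\left( \bar\varphi^{(\mu)}(\lambda), \lambda^{(\mu)} \right) \\
\bar y^{(n)} &= \bar\psi_1^{(n)}(t_3) = D^n \bar\psi(\lambda) = R_n\left( \bar\psi^{(\mu)}(\lambda), \lambda^{(\mu)} \right),
\end{aligned}$$

and these thus depend on the quantities $\lambda, \lambda', \dots \lambda^{(n)}$, of which, by (60), only the first $\lambda, \lambda', \dots \lambda^{(n-1)}$ are determined, whereas $\lambda^{(n)}$ is still entirely arbitrary. But since $E$ must have a determinate value on account of its geometric meaning, it can only appear to depend on $\lambda^{(n)}$, and its derivative with respect to $\lambda^{(n)}$ must vanish identically.

Indeed, by (2), we have by means of (60)

$$\frac{\partial \bar x^{(n)}}{\partial \lambda^{(n)}} = \bar\varphi'(\lambda) = \frac{\varphi'(t_3)}{\lambda'} = \frac{x'}{\lambda'}, \qquad
\frac{\partial \bar y^{(n)}}{\partial \lambda^{(n)}} = \bar\psi'(\lambda) = \frac{\psi'(t_3)}{\lambda'} = \frac{y'}{\lambda'},$$

and hence, by (73),

$$\frac{\partial E}{\partial \lambda^{(n)}} = \frac{\partial \bar F}{\partial \lambda^{(n)}} - \frac{\partial \bar x^{(n)}}{\partial \lambda^{(n)}}\,\frac{\partial F}{\partial x^{(n)}} - \frac{\partial \bar y^{(n)}}{\partial \lambda^{(n)}}\,\frac{\partial F}{\partial y^{(n)}} = 0, \tag{74}$$

since, by $(17)_n$,

$$\frac{\partial F}{\partial x^{(n)}}\,\frac{\partial \bar x^{(n)}}{\partial \lambda^{(n)}} + \frac{\partial F}{\partial y^{(n)}}\,\frac{\partial \bar y^{(n)}}{\partial \lambda^{(n)}} = \frac{1}{\lambda'} \left( \frac{\partial F}{\partial x^{(n)}}\,x' + \frac{\partial F}{\partial y^{(n)}}\,y' \right) = 0$$

and

$$\frac{\partial \bar F}{\partial \lambda^{(n)}} = \frac{\partial \bar F}{\partial \bar x^{(n)}}\,\frac{\partial \bar x^{(n)}}{\partial \lambda^{(n)}} + \frac{\partial \bar F}{\partial \bar y^{(n)}}\,\frac{\partial \bar y^{(n)}}{\partial \lambda^{(n)}} = \frac{1}{\lambda'} \left( \frac{\partial \bar F}{\partial \bar x^{(n)}}\,x' + \frac{\partial \bar F}{\partial \bar y^{(n)}}\,y' \right) = 0 .$$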







Sei jetzt g = g(t) = g(t',e) = eg(t) + (e )2 eine fur alle Werte von t 
mit e gleichzeitig unendlich klein werdende ganze Function von t , die den 
Bedingungen geniigt: 

gM (to) = t 0 (m) , ( t 3 - e) = t 3 (m) + e^o £ 

(M = 0, 1, . • .n — 1) , 

57 wo , t 3 ' /1 ' 1 dieselben Bedeutungen wie in (64) haben; eine solche | Function 
kann nach dem Verfahren von (64) iinmer bestimmt werden, namlich 

n — 1 

9=^2 - e) + (t 3 (m) + e^oe) fc M (t;i 3 to)} ■ 

fi — 0 

Dann werden die Gleichungen (65) iibergehen in die folgenden: 

[D' l Ve{t + g)] t = t o = [D^tp £ (t + T)\ to = p M (t, 0 ) 

[D^ipeit + £»)] to = + r)] tQ = ^\t 0 ) 

[D^<p e (t + g)\ ts _ £ = [D»ip e (t + r)] t3 = Tp { f>(t 3 -e) 

[D^ £ (t + g)\ t3 _ £ = [D^ e (t + r)] t3 = ^ (t 3 - e) 

(m = 0, 1, ■ ■ ■ n — 1) . 

1st daher, wie vorausgesetzt, 



ti ^ to < ^3 = ^2 i 



so kann der zusammengesetzte Linienzug 1 0 4 3 2 ausgedriickt werden durch 
die Gleichungen: 



X = ip(t) , 


y = i>(t) 


(ti ^ t ^ t 0 ) 


X = ip e (t + g) , 


y = ip e (t + g ) 


(t 0 ^ t ^ t 3 - e) 


x = Tp^t) , 


y = V’i 0) 


(t 3 - £ ^ t ^ t 3 ) 


x = p(t) , 


y = i>(t) 


(t 3 ^ t ^ * 2 ) , 



wo die Grossen 

xM = i£-> y^ ) = d ^ (m = 0, 1, . . . n — 1) 



(76) 



wegen (60) und (75) auch an den Ubergangsstellen t = to, t = t 3 — s, t = t 3 
der Curven a, a £ und a immer stetig bleiben und wo die entsprechenden 
absoluten Betrage 



58 



x^-ip^\t) , 



yM-^)( t ) 







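Now suppose that $\varrho = \varrho(t) = \varrho(t;\varepsilon) = \varepsilon\,\bar\varrho(t) + (\varepsilon)_2$ is an entire function of $t$ that becomes infinitely small together with $\varepsilon$ for all values of $t$ and that satisfies the conditions

$$\varrho^{(\mu)}(t_0) = \tau_0^{(\mu)}, \qquad \varrho^{(\mu)}(t_3 - \varepsilon) = \tau_3^{(\mu)} + e_{\mu,0}\,\varepsilon \qquad (\mu = 0, 1, \dots n-1),$$

where $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$ have the same meaning as in (64); a function of this kind can always be determined by means of the procedure of (64), namely

$$\varrho = \sum_{\mu=0}^{n-1} \left\{ \tau_0^{(\mu)}\, k_\mu(t; t_0, t_3 - \varepsilon) + \left( \tau_3^{(\mu)} + e_{\mu,0}\,\varepsilon \right) k_\mu(t; t_3 - \varepsilon, t_0) \right\}.$$

The equations (65) are then transformed into the following ones:

$$\begin{aligned}
\left[ D^\mu \varphi_\varepsilon(t + \varrho) \right]_{t = t_0} &= \left[ D^\mu \varphi_\varepsilon(t + \tau) \right]_{t_0} = \varphi^{(\mu)}(t_0) \\
\left[ D^\mu \psi_\varepsilon(t + \varrho) \right]_{t_0} &= \left[ D^\mu \psi_\varepsilon(t + \tau) \right]_{t_0} = \psi^{(\mu)}(t_0) \\
\left[ D^\mu \varphi_\varepsilon(t + \varrho) \right]_{t_3 - \varepsilon} &= \left[ D^\mu \varphi_\varepsilon(t + \tau) \right]_{t_3} = \bar\varphi_1^{(\mu)}(t_3 - \varepsilon) \\
\left[ D^\mu \psi_\varepsilon(t + \varrho) \right]_{t_3 - \varepsilon} &= \left[ D^\mu \psi_\varepsilon(t + \tau) \right]_{t_3} = \bar\psi_1^{(\mu)}(t_3 - \varepsilon)
\end{aligned} \qquad (\mu = 0, 1, \dots n-1).$$

If, as assumed,

$$t_1 \le t_0 < t_3 \le t_2,$$

then the compound line 1 0 4 3 2 can be expressed by the equations

$$\begin{aligned}
x &= \varphi(t), & y &= \psi(t) & (t_1 \le t \le t_0) \\
x &= \varphi_\varepsilon(t + \varrho), & y &= \psi_\varepsilon(t + \varrho) & (t_0 \le t \le t_3 - \varepsilon) \\
x &= \bar\varphi_1(t), & y &= \bar\psi_1(t) & (t_3 - \varepsilon \le t \le t_3) \\
x &= \varphi(t), & y &= \psi(t) & (t_3 \le t \le t_2),
\end{aligned} \tag{76}$$

where the quantities

$$x^{(\mu)} = \frac{d^\mu x}{dt^\mu}, \qquad y^{(\mu)} = \frac{d^\mu y}{dt^\mu} \qquad (\mu = 0, 1, \dots n-1),$$

on account of (60) and (75), always remain continuous at the transition points $t = t_0$, $t = t_3 - \varepsilon$, $t = t_3$ of the curves $a$, $a_\varepsilon$ and $\bar a$, and where the corresponding absolute values

$$\left| x^{(\mu)} - \varphi^{(\mu)}(t) \right|, \qquad \left| y^{(\mu)} - \psi^{(\mu)}(t) \right|$$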







fur alle Werte von t zwischen t\ und £2 mit e gleichzeitig unendlich klein 
werden; denn da 



0 = £Q(t) + {eh 

und 



<Pe(t) = - £<P e (t) + {e)2 , 

so wird auch: 

D^ip e (t + Q) = t ) + (e) 

und fiir t 3 — £ 'A t 'A <3 auch: 

= ^i m) (* 3) + (e) = + (e) = ¥> w (0 + (e) 

(/x = 0, 1,2, . 

Ausserdem beriihrt diese Curve 1 0 4 3 2 die erste a oder 1 0 3 2 in den 
Punkten 1 und 2 von der Ordnung n — 1, da 

a>) = <pM(ti) , yW = ^ (#i) (*i) (f = fi) 

(/r = 0, 1, . . .n - 1) , 

y („) = ^00 (t 2 ) , y(M) = ( t2 ) (t = t 2 ) 

wegen (75) auch in den Grenzfallen, wo einer der Punkte 0 oder 3 mit 1 oder 2 
zusammenfallt . 

Es kann daher 1 0 4 3 2 als eine „erlaubte“ und fiir hinreichend klei- 
ne e > 0 beliebig eng „benachbarte“ Variation von a oder 1 2 angesehen 
werden, wenn in den Definitions-Gleichungen (34) und (37) r = m = n — 1 
angenommen wird. Da nun alle Teil-Integrale unabhangig sind von der be- 
sonderen Darstellung der Curven, so kann man hier die Formel (72) benutzen 
und erhalt: 



AJ — J10432 — ^1032 — Jo 43 — Jo 3 

= eE (x^\ y^ ) ;x { 3 n \y < 3 n) ^j + (e) 2 , 

ein Ausdruck, der nach (31) wenigstens fiir hinreichend kleine e immer positiv 
sein muss, wenn clem Stiick 1 2 von a ein wirkliches Minimum des betrachte- 
ten Integrales entsprechen soil, also wegen e > 0 auch: 

E(x^\y = E (t 3 ;p,q) A 0 (77) 

unabhangig von e, aber fiir beliebige Werte der Grossen p, q. 

59 | Da nun die Punkte 0 und 3 zwischen 1 und 2 willkiirlich angenommen 

werden durften, so muss auch allgemein: 



E(t\p,q) ^ 0 







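for all values of $t$ between $t_1$ and $t_2$ become infinitely small together with $\varepsilon$; for, since

$$\varrho = \varepsilon\,\bar\varrho(t) + (\varepsilon)_2$$

and

$$\varphi_\varepsilon(t) = \varphi(t) - \varepsilon\,\bar\varphi_\varepsilon(t) + (\varepsilon)_2,$$

we also have

$$D^\mu \varphi_\varepsilon(t + \varrho) = \varphi^{(\mu)}(t) + (\varepsilon)$$

and, when $t_3 - \varepsilon \le t \le t_3$, also

$$\bar\varphi_1^{(\mu)}(t) = \bar\varphi_1^{(\mu)}(t_3) + (\varepsilon) = \varphi^{(\mu)}(t_3) + (\varepsilon) = \varphi^{(\mu)}(t) + (\varepsilon) \qquad (\mu = 0, 1, 2, \dots n-1).$$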

In addition, this curve 1 0 4 3 2 makes contact of order $n - 1$ with the first one, $a$ or 1 0 3 2, at the points 1 and 2, since

$$\begin{aligned}
x^{(\mu)} &= \varphi^{(\mu)}(t_1), & y^{(\mu)} &= \psi^{(\mu)}(t_1) & (t = t_1) \\
x^{(\mu)} &= \varphi^{(\mu)}(t_2), & y^{(\mu)} &= \psi^{(\mu)}(t_2) & (t = t_2)
\end{aligned} \qquad (\mu = 0, 1, \dots n-1),$$

on account of (75), also in the borderline cases where one of the points 0 and 3 coincides with 1 or 2.

We may therefore consider 1 0 4 3 2 an “admissible” and, for sufficiently small $\varepsilon > 0$, arbitrarily closely “neighboring” variation of $a$ or 1 2, provided that $r = m = n - 1$ is assumed in the defining equations (34) and (37). Now since all partial integrals are independent of the particular representation of the curves, we may use formula (72) here, thereby obtaining

$$\Delta J = J_{10432} - J_{1032} = J_{043} - J_{03} = \varepsilon\,E\left( x_3^{(\mu)}, y_3^{(\mu)}; \bar x_3^{(n)}, \bar y_3^{(n)} \right) + (\varepsilon)_2,$$

an expression that, by (31), must always be positive, at least for sufficiently small $\varepsilon$, if a real minimum of the integral under consideration is to correspond to the segment 1 2 of $a$, and hence also, since $\varepsilon > 0$,

$$E\left( x_3^{(\mu)}, y_3^{(\mu)}; \bar p, \bar q \right) = E(t_3; \bar p, \bar q) \ge 0 \tag{77}$$

independent of $\varepsilon$, but for arbitrary values of the quantities $\bar p$, $\bar q$.

Now since we could choose any points 0 and 3 between 1 and 2, it must, generally, be the case that

$$E(t; \bar p, \bar q) \ge 0$$







sein fur das ganze Intervall t\ < t < t-z als eine notwendige Bedingung fiir das 
Bestehen eines Minimums in dem angegebenen Sinne. 

Satz V. Soil unter der Annahme r = m = n — 1 in (34) und (37) ein 
unseren bisherigen Voraussetzunqen entsprechendes Stuck 1 2 einer Losung a 
der Differentialgleichung des Problems ein wirkliches Minimum des Integrates 
liefem, so darf die Function E(t;p,q) an keiner Stelle t des ganzen Intervalls 
1 1 . . . t 2 fur irgend welche Werte der Grossen p, q negativ werden konnen. 

Um nun das Vorzeichen der Function E zu untersuchen, wollen wir die- 
selbe einer gewissen fiir n = 1 ebenfalls schon von Herrn Prof. Weierstrass 
angegebenen Umformung unterziehen. 

Es ist nach (73a): 

dF 

E(p,q; p, q) = F(p, q ) - F(p,q)~ — (p, q) (p - p) 

dF 

- -Q^(p,q){q-q) ■ 

Setzt man hier: 



Pe = v + e(p - p) , q £ = q + e(q - q ) , 



so dass fiir £ = 0 : 
und fiir e = 1 : 
so wird: 

E{p,q;p,q ) = 



Pe = p , q s = q , 
Pe = p , q s = q, 



dF 



F(Pe,q e ) + (1 -£)(p-p)^—{Pe,qe) 

dpe 

+ (1 - e){q- q)^-(p s ,q s ) 



£ = 1 



J £ = 0 

1 

= / (1 ~ £ ){ ^(Pe^e)(P~P) 2 

d 2 F d 2 F 1 

+ 2 dpdq {Pe,qe)(p-p)(q- q) + -^(pe,qe){q - ?) 2 | de 



60 | oder, da nach (19): 

d 2 F 

dp 2 e 



d 2 F 

= y' 2 F\(p £ , q e ) ° = x ' 2 Fl (p e ,q E ), 



d 2 F 
dpedq e 



dq 2 e 

= -x'y' F-i^qe) 



1 

E(p,q;p,q) = ( y'{p~P ) - x'{q- q)) 2 J F 1 (p e ,q e )(l - e) de 



= (k~k) 2 Ei(p,q;p,q) , 



(78) 







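for the entire interval $t_1 < t < t_2$, which is a necessary condition for the existence of a minimum in the specified sense.

Theorem V. If, assuming that $r = m = n - 1$ in (34) and (37), a segment 1 2 of a solution $a$ of the differential equation of the problem, satisfying our previous requirements, is supposed to furnish a real minimum of the integral, then the function $E(t; \bar p, \bar q)$ must not be negative at any position $t$ of the entire interval $t_1 \dots t_2$ for any values of the quantities $\bar p$, $\bar q$.

In order to investigate the sign of the function $E$, we shall subject it to a certain transformation which has likewise already been specified by Prof. Weierstrass for $n = 1$.

By (73a),

$$E(p, q; \bar p, \bar q) = F(\bar p, \bar q) - F(p, q) - \frac{\partial F}{\partial p}(p, q)\,(\bar p - p) - \frac{\partial F}{\partial q}(p, q)\,(\bar q - q).$$

If we now set

$$p_\varepsilon = p + \varepsilon(\bar p - p), \qquad q_\varepsilon = q + \varepsilon(\bar q - q),$$

so that $p_\varepsilon = p$, $q_\varepsilon = q$ for $\varepsilon = 0$ and $p_\varepsilon = \bar p$, $q_\varepsilon = \bar q$ for $\varepsilon = 1$, then

$$\begin{aligned}
E(p, q; \bar p, \bar q) &= \left[ F(p_\varepsilon, q_\varepsilon) + (1 - \varepsilon)(\bar p - p)\,\frac{\partial F}{\partial p_\varepsilon}(p_\varepsilon, q_\varepsilon) + (1 - \varepsilon)(\bar q - q)\,\frac{\partial F}{\partial q_\varepsilon}(p_\varepsilon, q_\varepsilon) \right]_{\varepsilon = 0}^{\varepsilon = 1} \\
&= \int_0^1 (1 - \varepsilon) \left\{ \frac{\partial^2 F}{\partial p_\varepsilon^2}(p_\varepsilon, q_\varepsilon)\,(\bar p - p)^2 + 2\,\frac{\partial^2 F}{\partial p_\varepsilon\,\partial q_\varepsilon}(p_\varepsilon, q_\varepsilon)\,(\bar p - p)(\bar q - q) + \frac{\partial^2 F}{\partial q_\varepsilon^2}(p_\varepsilon, q_\varepsilon)\,(\bar q - q)^2 \right\} d\varepsilon
\end{aligned}$$

or, since, by (19),

$$\frac{\partial^2 F}{\partial p_\varepsilon^2} = y'^2 F_1(p_\varepsilon, q_\varepsilon), \qquad \frac{\partial^2 F}{\partial p_\varepsilon\,\partial q_\varepsilon} = -x' y' F_1(p_\varepsilon, q_\varepsilon), \qquad \frac{\partial^2 F}{\partial q_\varepsilon^2} = x'^2 F_1(p_\varepsilon, q_\varepsilon),$$

$$\begin{aligned}
E(p, q; \bar p, \bar q) &= \left( y'(\bar p - p) - x'(\bar q - q) \right)^2 \int_0^1 F_1(p_\varepsilon, q_\varepsilon)\,(1 - \varepsilon)\,d\varepsilon \\
&= (\bar k - k)^2\,E_1(p, q; \bar p, \bar q),
\end{aligned} \tag{78}$$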







wenn: 



y'p — x'q = k , y'p — x'q = k 

gesetzt wird und das bestimmte Integral mit E\ bezeichnet. Fiir n > 1 ist 
diese Formel immer anwendbar, so lange die nach p e und q e genommenen 
zweiten partiellen Ableitungen und daher auch F\ endlich und stetig bleiben 
(vergl. die bei (19) gemachte Bemerkung). 

Da in (78) der erste Factor immer positiv ist, so darf auch E\ im Falle 
eines Minimums immer nur ein positives Vorzeichen besitzen. 

Durch Anwendung des Mittelwertsatzes auf das Integral (78) erhalt man, 
was sich noch einfacher direkt aus (73a) durch Berechnung des Restgliedes 
der Taylorschen Entwicklung ergeben hatte: 

l 

Ei(p,q;p,q) = Fi(p„,q„) J (1 - e) de = q„) , (79) 

o 

wo k eine gewisse Grosse zwischen 0 und 1 bedeutet. 

Die Formeln (78) und (79) gestatten nun folgende Schliisse: 

1. Fiir hinreichend klein gewahlte \p — p\ und \q — q\ erhalten E\ und E 
immer dasselbe Vorzeichen wie Fi(p e ,q £ ) und somit auch wie Fi(p,q), falls 
dieser Ausdruck nicht verschwindet, wahrend gleichzeitig k — k = y’ (p — p) — 
x'(q—q) und daher auch E bei geeigneter Wahl von p und q von 0 verschieden 
angenommen werden kann. 

2. Besitzt F\(p,q) fiir irgend eine Combination der fortgelassenen Argu- 
61 mente x,x' , . . . x^ n ~ ^ ; y, y' , . . . y^ n ~ ein be- \ stimmtes, von p, q unabhan- 

giges Vorzeichen, so besteht dasselbe Vorzeichen auch fiir alle zugehorigen 
F\ (p e , q e ) und damit auch fiir E\ (p, q; p, q) und E(p, q ; p, q), unabhangig von p 
und q. 

Fiir das Stattfinden der in V verlangten Eigenschaft eines Curvenstiickes a 
ist also: 

1. notwendig, dass F\(x^\y^) ^ 0 ist auf dem 

ganzen Curvenstiick, d. h. fiir (80) 

xOd = tpW (t) , yW = (p, = 0, l,,..n; t\ ^ t ^ t 2 ) , 

2. hinreichend , dass F\ (x^\y^) ^ 0 ebenfalls 

auf dem ganzen Curvenstiick, fiir (81) 

y M =^\t) (p = 0, 1, . . .n — 1) , 

aber fiir willkurliche Werte der x ^ = p, y^ = q. 

In V ist also noch folgende notwendige Bedingung des Minimums enthal- 
ten, die, nur in anderer Form, gewohnlich aus der Betrachtung der „zweiten 
Variation 11 gewonnen wird: 







if we set

$$y' p - x' q = k, \qquad y' \bar p - x' \bar q = \bar k$$

and denote the definite integral by $E_1$. This formula may always be applied when $n > 1$ as long as the second partial derivatives with respect to $p_\varepsilon$ and $q_\varepsilon$, and hence also $F_1$, remain finite and continuous (cf. the observation made in connection with (19)).

Since the first factor in (78) is always positive, $E_1$, too, may only ever have a positive sign in the case of a minimum.

Applying the mean value theorem to the integral (78), we obtain what we would have obtained directly, and even more simply, from (73a) by calculating the remainder term of the Taylor series:

$$E_1(p, q; \bar p, \bar q) = F_1(p_\kappa, q_\kappa) \int_0^1 (1 - \varepsilon)\,d\varepsilon = \tfrac{1}{2}\,F_1(p_\kappa, q_\kappa), \tag{79}$$

where $\kappa$ denotes a certain quantity between 0 and 1.
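The step from (73a) to the integral form that precedes (78) is the Taylor formula with integral remainder, and it can be checked exactly on a small example. The sketch below assumes a hypothetical cubic integrand F(p, q) = p^3 + p q^2 - 2 q^3, chosen only because its second derivatives are linear; it does not satisfy the homogeneity conditions of the text, so only the remainder identity is illustrated, not the factorization through F_1. The names `excess` and `excess_integral` are likewise illustrative.

```python
from fractions import Fraction


# Hypothetical sample integrand (an assumption for illustration only).
def F(p, q):
    return p**3 + p * q**2 - 2 * q**3


def F_p(p, q):
    return 3 * p**2 + q**2


def F_q(p, q):
    return 2 * p * q - 6 * q**2


def second_order_form(p, q, dp, dq):
    """F_pp*dp^2 + 2*F_pq*dp*dq + F_qq*dq^2 for the sample F."""
    return 6 * p * dp**2 + 2 * (2 * q) * dp * dq + (2 * p - 12 * q) * dq**2


def excess(p, q, pb, qb):
    """E(p, q; pb, qb) as in (73a); pb, qb stand for the barred quantities."""
    return F(pb, qb) - F(p, q) - F_p(p, q) * (pb - p) - F_q(p, q) * (qb - q)


def excess_integral(p, q, pb, qb):
    """Integral over 0 <= eps <= 1 of (1 - eps) * second_order_form(p_eps, q_eps).

    For a cubic F the integrand is quadratic in eps, so a single Simpson
    panel reproduces the integral exactly.
    """
    dp, dq = pb - p, qb - q

    def g(eps):
        return (1 - eps) * second_order_form(p + eps * dp, q + eps * dq, dp, dq)

    return (g(0) + 4 * g(Fraction(1, 2)) + g(1)) / 6


if __name__ == "__main__":
    args = (Fraction(2), Fraction(-1), Fraction(-3), Fraction(5))
    print(excess(*args), excess_integral(*args))
```

Both printed values should coincide for any rational arguments, since the two expressions are equal as polynomials.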

Formulas (78) and (79) now permit the following conclusions:

1. If $|\bar p - p|$ and $|\bar q - q|$ are chosen sufficiently small, then $E_1$ and $E$ always get the same sign as $F_1(p_\varepsilon, q_\varepsilon)$, and hence also as $F_1(p, q)$, provided this expression does not vanish, whereas at the same time $\bar k - k = y'(\bar p - p) - x'(\bar q - q)$, and hence also $E$, may be taken to be different from 0 for a suitable choice of $\bar p$ and $\bar q$.

2. If $F_1(p, q)$, for any combination of the omitted arguments $x, x', \dots x^{(n-1)}$; $y, y', \dots y^{(n-1)}$, possesses a particular sign independent of $p$, $q$, then this sign is also shared by all associated $F_1(p_\varepsilon, q_\varepsilon)$, and hence also by $E_1(p, q; \bar p, \bar q)$ and $E(p, q; \bar p, \bar q)$, independently of $\bar p$ and $\bar q$.

In order for a curve segment $a$ to have the property required in V, it is therefore:

1. necessary that $F_1\left( x^{(\mu)}, y^{(\mu)} \right) \ge 0$ on the entire curve segment, i.e., for

$$x^{(\mu)} = \varphi^{(\mu)}(t), \qquad y^{(\mu)} = \psi^{(\mu)}(t) \qquad (\mu = 0, 1, \dots n;\ t_1 \le t \le t_2), \tag{80}$$

2. sufficient that $F_1\left( x^{(\mu)}, y^{(\mu)} \right) \ge 0$ also on the entire curve segment, for

$$x^{(\mu)} = \varphi^{(\mu)}(t), \qquad y^{(\mu)} = \psi^{(\mu)}(t) \qquad (\mu = 0, 1, \dots n-1), \tag{81}$$

but for arbitrary values of the $x^{(n)} = p$, $y^{(n)} = q$.

Hence, V also contains the following necessary condition for the minimum, which, albeit in a different form, is usually obtained by considering the “second variation”:







62 



Satz VI. Wenn a ein wirkliches Minimum liefern soil, so darf in der 
ganzen Ausdehnung des Curvenstuckes die Function F\, oder was nach (19) 

d 2 F d 2 F 

oder „ , aus den Ableitungen x^\ 



dasselbe bedeutet, die Function 



Q%(n) 2 dy( n ) 2 

der Coordinaten von a selbst. gebildet, nirgendwo negative Werte anneh- 

men. 

Hier ist der Beweis nur gefiihrt fur r = m = n — 1 in (34) und (37); 
doch lasst er sich unmittelbar ribertragen auf m = n, da, wie man sich 
leicht iiberzeugt, durch Verkleinerung von \p — p\ = (t, 3 ) — tp( n \t 3 )\ 

und \q — q\ = in (76) auch iiberall \x — (p( n \t) | und 

| y( ra ) — beliebig klein gemacht werden konnen. Nur der Satz V, wel- 

cher die unbeschrankt willkurliche Wahl der p, q voraussetzt, muss dann seine 
Giiltigkeit verlieren, wahrend VI gerade durch Betrachtung kleiner |p — p|, 
| q — q | abgeleitet war. 

Fiir den Fall n = 1 geht wegen 



dF dF 

P-g^{P,Q) + Q~g^{P, Q) = F(p,q) 



die Formel (73a) fiber in: 



dF , 



(p = x',q = y') 



dF 



( 17 ), 



E(p,q-,p,q) = F(p, q) -p-^(p,q) - q~7^(p,q) 



und es wird infolge der vorausgesetzten Homogeneitat (17)„ von F(p,q), al- 
so auch von F(p,q): F(—p,—q ) = — F(p,q ), ebenso auch E(p, q\ —p, —q) = 
— E(p, q;p,q ), wenn F als eine eindeutige analytische, z.B. als eine rationa- 
le Function von p und q angenommen wird. Es kann demnach E{jp, q;p,q), 
wenn es nicht etwa fiir beliebige p, q immer verschwinden sollte, fiir passend 
gewahlte Werte dieser Grossen beliebig auch negativ gemacht werden, und 
ein Minimum in dem angegebenen Sinne (m = r = n — 1 = 0) ist unmoglich. 

Dieser von Herrn Prof. Weierstrass in der dargestellten Weise allgemein 
bewiesene Satz lasst sich aber auf den Fall n > 1 nicht ribertragen. 

Auch fiir rationale Functionen F(x^\y^^ kann die Function E ein 
von p , q unabhdngiges Vorzeichen und das zugehorige Integral in einer Nach- 
barschaft. von der Ordnung m ^ n — 1 ein wirkliches Minimum besitzen, 
wenn n > 1 ist. 

Dies soil zunachst durch ein Beispiel gezeigt werden. 

Es sei namlich: 



F 






M ) _„2 



= Vn x 




nach (20) eine gestattete Annahme. Hier kann, wie man sich durch vollstan- 
dige Induction leicht iiberzeugt, geschrieben werden: 



d n y x'y - y'x^> 
dx™ ~ x' n + 1 



(xM,y^) n _ 



(82) 







Theorem VI. For $a$ to furnish a real minimum, the function $F_1$, or what, by (19), amounts to the same thing, the function $\dfrac{\partial^2 F}{\partial x^{(n)2}}$ or $\dfrac{\partial^2 F}{\partial y^{(n)2}}$, formed from the derivatives $x^{(\mu)}$, $y^{(\mu)}$ of the coordinates of $a$ itself, must never assume negative values in the entire extension of the curve segment.

We only provide the proof for $r = m = n - 1$ in (34) and (37); but it can be immediately extended to $m = n$, since, as can be readily seen, we can make $\left| x^{(n)} - \varphi^{(n)}(t) \right|$ and $\left| y^{(n)} - \psi^{(n)}(t) \right|$ everywhere arbitrarily small by decreasing $|\bar p - p| = \left| \bar\varphi_1^{(n)}(t_3) - \varphi^{(n)}(t_3) \right|$ and $|\bar q - q| = \left| \bar\psi_1^{(n)}(t_3) - \psi^{(n)}(t_3) \right|$ in (76). Only Theorem V, which presupposes the entirely arbitrary choice of the $\bar p$, $\bar q$, must then lose its validity, whereas VI had been derived precisely by considering small $|\bar p - p|$, $|\bar q - q|$.

When $n = 1$, on account of

$$p\,\frac{\partial F}{\partial p}(p, q) + q\,\frac{\partial F}{\partial q}(p, q) = F(p, q) \qquad (p = x',\ q = y'),$$

the formula (73a) is transformed into

$$E(p, q; \bar p, \bar q) = F(\bar p, \bar q) - \bar p\,\frac{\partial F}{\partial p}(p, q) - \bar q\,\frac{\partial F}{\partial q}(p, q)$$

and, by dint of the assumed homogeneity $(17)_n$ of $F(p, q)$, and hence also of $F(\bar p, \bar q)$, we have $F(-\bar p, -\bar q) = -F(\bar p, \bar q)$, and also $E(p, q; -\bar p, -\bar q) = -E(p, q; \bar p, \bar q)$, provided that $F$ is a single-valued analytic function, e.g. a rational function, of $p$ and $q$. Thus, unless $E(p, q; \bar p, \bar q)$ always vanishes for arbitrary $\bar p$, $\bar q$, we can also make it negative for suitably chosen values of these quantities, and a minimum in the sense specified ($m = r = n - 1 = 0$) is impossible.
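The sign change described for n = 1 can be observed on a concrete rational integrand. The following sketch assumes the hypothetical choice F(p, q) = q^2 / p, which is homogeneous of degree 1 in (p, q); the function names are illustrative and not taken from the text. It compares the general form (73a) with the reduced form for n = 1 and evaluates E at the reversed pair (-pb, -qb), where pb, qb stand for the barred quantities.

```python
from fractions import Fraction


# Hypothetical rational integrand, homogeneous of degree 1 in (p, q) = (x', y').
def F(p, q):
    return q**2 / p


def F_p(p, q):
    return -(q**2) / p**2


def F_q(p, q):
    return 2 * q / p


def excess(p, q, pb, qb):
    """General form (73a)."""
    return F(pb, qb) - F(p, q) - F_p(p, q) * (pb - p) - F_q(p, q) * (qb - q)


def excess_n1(p, q, pb, qb):
    """Reduced form for n = 1, which uses p*F_p + q*F_q = F."""
    return F(pb, qb) - pb * F_p(p, q) - qb * F_q(p, q)


if __name__ == "__main__":
    p, q, pb, qb = Fraction(2), Fraction(1), Fraction(3), Fraction(5)
    print(excess(p, q, pb, qb), excess_n1(p, q, pb, qb), excess(p, q, -pb, -qb))
```

The first two printed values agree, and the third is their negative, so for this F the function E takes both signs, in line with the argument of the text.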

It is, however, not possible to extend this theorem, which Prof. Weierstrass proved for the general case in the way presented here, to the case $n > 1$.

The function $E$ may possess a sign independent of $\bar p$, $\bar q$ also for rational functions $F\left( x^{(\mu)}, y^{(\mu)} \right)$, and the associated integral may possess a real minimum in a neighborhood of order $m \le n - 1$, when $n > 1$.

This will first be demonstrated by an example.

For suppose that

$$F\left( x^{(\mu)}, y^{(\mu)} \right) = y_n^2\,x',$$

which, by (20), is a permissible assumption. In this case, as can be readily shown by mathematical induction, we can write

$$y_n = \frac{d^n y}{dx^n} = \frac{x' y^{(n)} - y' x^{(n)}}{x'^{\,n+1}} + \left( x^{(\mu)}, y^{(\mu)} \right)_{n-1}, \tag{82}$$







wo der Klammerausdruck die Grossen p = x ^ und q = y ^ nicht mehr 
enthalt. Daher ist: 



dF 

dp 



O 9y n , 

2y -i^ x 



v 



= v '2 Fl = V v 

dp 2 y 1 x ,2n + 2 ’ 

2 



63 | und nach (78): 



E(p,q;p,q) 



( k ~ k f J ^ +r( 1 - e ) de = 

0 



x'(p-p)-y'(q-q) 

T fn + 1 



(/c — fc ) 2 

^./2n + 1 



ein Ausdruck, der immer ein positives Vorzeichen hat fiir x' A 0, unabhangig 
von p, q. 

In der That entspricht auch einer Parabel 2 n ter Ordnung: 

x = ip(t)=t, y = ip{t) = a 0 + a\t + . . .a 2n - it 2 " -1 (83) 



fiir m < n ein wirkliches Minimum des zwischen beliebigen Grenzen t\ und t 2 
erstreckten Integrals: J = / y^x' dt,. 

Fiir alle von einer Ordnung to ^ 1 der Parabel benachbarten Curven a 
x = Tp{\) , y = V>(A) 



kann nach (37) immer 



|p'(A)A' - y'{n)\ = |p'(A)A' - 1| < 1 (A' > 0) , 



also 



x' = ip'{ \)\' > 0 , <//(A) > 0 

angenommen und daher auch immer eine Darstellung fiir a von der Form: 
x = t = x , y = Ip(t) = ip(t) + 7] (t) = y + rj , 



(84) 







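where the bracketed expression no longer contains the quantities $p = x^{(n)}$ and $q = y^{(n)}$. Hence

$$\frac{\partial F}{\partial p} = 2 y_n\,\frac{\partial y_n}{\partial p}\,x' = -\frac{2 y'}{x'^{\,n+1}}\,y_n\,x', \qquad
\frac{\partial^2 F}{\partial p^2} = y'^2 F_1 = \frac{2 y'^2 x'}{x'^{\,2n+2}}, \qquad
F_1(p, q) = \frac{2}{x'^{\,2n+1}} = F_1(p_\varepsilon, q_\varepsilon),$$

and, by (78),

$$E(p, q; \bar p, \bar q) = (\bar k - k)^2 \int_0^1 \frac{2}{x'^{\,2n+1}}\,(1 - \varepsilon)\,d\varepsilon = \frac{(\bar k - k)^2}{x'^{\,2n+1}} = \frac{\left( y'(\bar p - p) - x'(\bar q - q) \right)^2}{x'^{\,2n+1}},$$

an expression that always has a positive sign for $x' > 0$, independently of $\bar p$, $\bar q$.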
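The closed form of E for the example F = y_n^2 x' can be compared with the defining formula (73a) in exact arithmetic. The sketch below rests on assumptions: `xp`, `yp` stand for x', y'; the bracketed term of (82), which is free of p and q, is replaced by an arbitrary constant `c`; and all function names are hypothetical.

```python
from fractions import Fraction


def make_example(n, xp, yp, c):
    """Return (F, F_p, F_q) for F = y_n**2 * x' with
    y_n = (x'*q - y'*p) / x'**(n+1) + c, following (82)."""

    def y_n(p, q):
        return (xp * q - yp * p) / xp ** (n + 1) + c

    def F(p, q):
        return y_n(p, q) ** 2 * xp

    def F_p(p, q):
        # d y_n / d p = -y' / x'**(n+1)
        return -2 * y_n(p, q) * yp / xp**n

    def F_q(p, q):
        # d y_n / d q = x' / x'**(n+1)
        return 2 * y_n(p, q) / xp ** (n - 1)

    return F, F_p, F_q


def excess(F, F_p, F_q, p, q, pb, qb):
    """E(p, q; pb, qb) as in (73a); pb, qb stand for the barred quantities."""
    return F(pb, qb) - F(p, q) - F_p(p, q) * (pb - p) - F_q(p, q) * (qb - q)


def closed_form(n, xp, yp, p, q, pb, qb):
    """(kb - k)**2 / x'**(2n+1) with k = y'*p - x'*q."""
    k, kb = yp * p - xp * q, yp * pb - xp * qb
    return (kb - k) ** 2 / xp ** (2 * n + 1)


if __name__ == "__main__":
    n, xp, yp, c = 2, Fraction(2), Fraction(3), Fraction(1, 2)
    F, F_p, F_q = make_example(n, xp, yp, c)
    pts = (Fraction(1), Fraction(-1), Fraction(4), Fraction(2))
    print(excess(F, F_p, F_q, *pts), closed_form(n, xp, yp, *pts))
```

The two printed values coincide, and the closed form does not depend on `c`, which reflects the remark that the bracketed expression in (82) plays no role in E.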

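In fact, to a parabola of order $2n - 1$,

$$x = \varphi(t) = t, \qquad y = \psi(t) = a_0 + a_1 t + \dots + a_{2n-1}\,t^{2n-1}, \tag{83}$$

there also corresponds, for $m < n$, a real minimum of the integral $J = \int y_n^2\,x'\,dt$ taken between arbitrary limits $t_1$ and $t_2$.

For all curves $\bar a$ neighboring the parabola with an order $m \ge 1$ and given by

$$\bar x = \bar\varphi(\lambda), \qquad \bar y = \bar\psi(\lambda),$$

we may always assume, by (37),

$$\left| \bar\varphi'(\lambda)\lambda' - \varphi'(t) \right| = \left| \bar\varphi'(\lambda)\lambda' - 1 \right| < 1 \qquad (\lambda' > 0),$$

and hence

$$\bar x' = \bar\varphi'(\lambda)\lambda' > 0, \qquad \bar\varphi'(\lambda) > 0,$$

and thus always use a representation for $\bar a$ of the form

$$\bar x = t = x, \qquad \bar y = \bar\psi(t) = \psi(t) + \eta(t) = y + \eta, \tag{84}$$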







wo ip(t) und ebenso ?y(i) eine eindeutige Function bedeutet, zu Grunde gelegt 
werden, so dass: 

rfnj: 

x' = 1 , y n = — = y (n) = y (n) + ?y (n) und 
*2 *2 

J = J(a) = J y 2 n x' dt = J ( y (n) + ?y (n) ) dt 

ti t\ 

t2 t2 t2 

= J y {n)2 dt + 2 J y^y^ dt + J rf- n ? dt . (85) 

t\ t i t\ 



64 | Nun ergiebt sich, wenn in (34) gemass IV r ^ n — 1 angenommen wird, 

also y und demnach ?y als eine mit ihren n — 1 ersten Ableitungen stetige 
Function von t im ganzen Intervall, durch partielle Integration: 

*2 

J y (n) ry (n) dt = 

t2 

+ J (~l) n y {2n) ydt = 0 . 

£l 

(86) 

Hier verschwindet der erste Teil, weil nach (32) die beiden Curven in den 
Endpunkten t± und G einander von n — 1 ter Ordnung beriihren, also 

= v i,l) - y (m) = 2 /„ - y» = o (y = o, 1, . . . n - 1) , 

und der zweite Teil wegen: 

j2n 

y f '~ n ' 1 = jj-2n ( a ° + 0,1 ^ + • • ' + a 2n - 1 ^" 1 ) = 0 

fiir alle Werte von t. Es wird also nach (85) und (86): 

t2 t2 £2 

J ~ J = J Vldt- J yl dt = j ?y (n)2 dt > 0 . 

t\ t\ t\ 

Verschwinden konnte die Differenz nur fiir = 0 (ti ^ t ^ 1 2 ), dies 
aber ergabe wegen jy^(ii) = 0 (y = 0, 1, . . . n — 1) durch nfache Integration 
oder nach dem Taylorschen Satze: 



„.(,ri)(n—l) n (n + — 2) , , / -i \n — l rti (2n — 1)„ 

y^ by 1 - —y y '+... + (—1) y K by 



ry = 0 , 



y = y (ti ^ t ^ t 2 ) , 







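where $\bar\psi(t)$ as well as $\eta(t)$ denotes a single-valued function, such that

$$\bar x' = 1\,, \qquad \bar y_n = \frac{d^n\bar y}{dx^n} = \bar y^{(n)} = y^{(n)} + \eta^{(n)}$$

and

$$\begin{aligned}
\bar J = J(\bar a) &= \int_{t_1}^{t_2} \bar y_n^2\,\bar x'\,dt = \int_{t_1}^{t_2} \bigl(y^{(n)} + \eta^{(n)}\bigr)^2 dt \\
&= \int_{t_1}^{t_2} y^{(n)2}\,dt + 2\int_{t_1}^{t_2} y^{(n)}\eta^{(n)}\,dt + \int_{t_1}^{t_2} \eta^{(n)2}\,dt\,.
\end{aligned} \tag{85}$$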



Let us now assume that $r \geq n-1$ in (34), in accordance with IV, and hence that $\bar y$, and thus $\eta$, is a function of $t$ which, together with its first $n-1$ derivatives, is continuous on the entire interval. By partial integration, we then obtain

$$\int_{t_1}^{t_2} y^{(n)}\eta^{(n)}\,dt = \Bigl[y^{(n)}\eta^{(n-1)} - y^{(n+1)}\eta^{(n-2)} + \dots + (-1)^{n-1}y^{(2n-1)}\eta\Bigr]_{t_1}^{t_2} + \int_{t_1}^{t_2} (-1)^n y^{(2n)}\eta\,dt = 0\,. \tag{86}$$



In this case, the first part vanishes since, by (32), the two curves make contact of $n-1$th order with one another at the endpoints $t_1$ and $t_2$, and hence

$$\eta^{(\mu)} = \bar y^{(\mu)} - y^{(\mu)} = \bar y_\mu - y_\mu = 0 \qquad (\mu = 0, 1, \dots n-1)\,,$$

and the second part, on account of

$$y^{(2n)} = \frac{d^{2n}}{dt^{2n}}\bigl(a_0 + a_1 t + \dots + a_{2n-1}t^{2n-1}\bigr) = 0\,,$$

for all values of $t$. By (85) and (86), we therefore have

$$\bar J - J = \int_{t_1}^{t_2} \bar y_n^2\,dt - \int_{t_1}^{t_2} y_n^2\,dt = \int_{t_1}^{t_2} \eta^{(n)2}\,dt > 0\,.$$

The difference could only vanish when $\eta^{(n)} = 0$ $(t_1 \leq t \leq t_2)$. But then we would obtain, on account of $\eta^{(\mu)}(t_1) = 0$ $(\mu = 0, 1, \dots n-1)$, by $n$-fold integration or by Taylor's theorem

$$\eta = 0\,, \qquad \bar y = y \qquad (t_1 \leq t \leq t_2)\,,$$







d. h. beide Curven Helen zusammen. Also: 

Das liber ein Stiick einer Parabel 2n — 1 ter Ordnung erstreckte Integral 
t2 

f yf dt ist immer kleiner als dasselbe Integral, erstreckt iiber irgend eine 

andere, die Parabel in den Endpunkten 1 und 2 von n — 1 ter Ordnung be- 
65 riihrende Curve mit bestandig | wachsender x-Coordinate und stetig sich 
andernden Ableitungen bis zur n — 1 ten Ordnung. 

Ausser der Moglichkeit eines Minimums (fiir m ^ n— 1) auch fiir rationale 
Functionen F zeigt dieses Beispiel, dass in der Definitionsgleichung (37) unter 
Umstanden auch m < n — 1, hier namlich m — 1 bei n > 2, angenommen 
werden darf, dass also fiir die Ordnungszahl m der Nachbarschaft. nicht etwa, 
dem Satze IV (r ^ n— 1) analog, eine von 0 oder 1 verschiedene untere Grenze 
allqemein festqestellt werden kann. 

Der Unterschied der Falle n = 1 und n > 1 tritt noch klarer hervor, wenn 
man gemass 

F (x M ,y { ^ = f x,y^x' , = (^ = 0, 1 ,...n) (20) 

auch die Function E einer entsprechenden Umformung unterzieht. 

Fiir n = 1 namlich ist: 



F{x,y;x',y’) 



f(x,y,yi)x' 



f 




dF = f _dll dF = dl = ,,, , 

dx' dy\ x' ’ dy' dy\ 1 

dF 8F 

ai + 

= /■($'- x') + /' (yi) x ') + v' - yf 

f ,-t /n , , y'x' -x'y'_, 

= f-{x - x) + f (y i) X 

XX 

= f ■ - x') + f (yi) (y 1 - j/i ) x , 

F (x, y; x', y') - F (x, y; x', y') = f (x, y,yi)x' - f (x, y, yi)x' . 



Also ist nach (73): 



E{x,y\x',y'\x',y') 



{/ (x, y, Vi) - f {x, y, yi) - {y 1 




(87) 



66 



und wechselt fiir eindeutige Functionen / sein Vorzeichen, wenn | man x' , y' 

— — — y' 

durch —x 1 , —y' ersetzt, wobei y-, = ^ unverandert bleibt. 







i.e., the two curves would coincide. Hence:

The integral $\int_{t_1}^{t_2} y_n^2\,dt$ taken along a segment of a parabola of $2n-1$th order is always smaller than the same integral taken along any other curve that has contact of $n-1$th order with the parabola at the endpoints 1 and 2 and has continuously increasing $x$ coordinate and continuously varying derivatives up to the $n-1$th order.
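
The statement can be checked numerically in a small case. The following is only a sketch in Python, not part of the original text: it assumes $n = 2$ on the interval $[0, 1]$, an arbitrarily chosen cubic, and the hypothetical perturbation $\eta(t) = t^2(1-t)^2$, which vanishes together with its first derivative at both endpoints. Exact rational arithmetic shows that the mixed term of (85) is zero and that the integral grows by exactly $\int \eta''^2\,dt$.

```python
from fractions import Fraction

def poly_deriv(p):
    """Derivative of a polynomial stored as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_mul(p, q):
    """Product of two non-empty coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two coefficient lists of possibly different length."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_integral(p, t1, t2):
    """Exact integral of the polynomial p over [t1, t2]."""
    return sum(c * (t2 ** (k + 1) - t1 ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))

def second_deriv(p):
    return poly_deriv(poly_deriv(p))

t1, t2 = Fraction(0), Fraction(1)

# Sample parabola of order 2n - 1 = 3 (coefficients chosen arbitrarily).
y = [Fraction(c) for c in (1, -2, 3, 5)]
# Illustrative perturbation eta(t) = t^2 (1 - t)^2 = t^2 - 2 t^3 + t^4.
eta = [Fraction(c) for c in (0, 0, 1, -2, 1)]
y_bar = poly_add(y, eta)

y2, eta2, y_bar2 = second_deriv(y), second_deriv(eta), second_deriv(y_bar)

J = poly_integral(poly_mul(y2, y2), t1, t2)              # integral along the parabola
J_bar = poly_integral(poly_mul(y_bar2, y_bar2), t1, t2)  # integral along the neighbor
cross = poly_integral(poly_mul(y2, eta2), t1, t2)        # mixed term of (85)
eta_energy = poly_integral(poly_mul(eta2, eta2), t1, t2)

print(J, J_bar, cross, eta_energy)
```

Any other polynomial perturbation with a double zero at both endpoints could be substituted for `eta`; the mixed term should again vanish, as (86) asserts.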

What this example illustrates, besides the possibility that rational functions $F$, too, have a minimum (when $m \leq n-1$), is the fact that, in the definition equation (37), we may also assume $m < n-1$, namely, in this case, $m = 1$ when $n > 2$, and hence that it is not possible, along the lines of Theorem IV ($r \geq n-1$), to generally find a lower limit different from 0 or 1 for the order number $m$ of the neighborhood.

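The difference between the cases $n = 1$ and $n > 1$ becomes even more evident once we, in accordance with

$$F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = f(x, y_\mu)\,x'\,, \qquad y_\mu = \frac{d^\mu y}{dx^\mu} \qquad (\mu = 0, 1, \dots n) \tag{20}$$

subject also the function $E$ to a corresponding transformation.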

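For when $n = 1$,

$$F(x, y; x', y') = f(x, y, y_1)\,x' = f\Bigl(x, y, \frac{y'}{x'}\Bigr)x'\,,$$

$$\frac{\partial F}{\partial x'} = f - \frac{\partial f}{\partial y_1}\frac{y'}{x'}\,, \qquad \frac{\partial F}{\partial y'} = \frac{\partial f}{\partial y_1} = f'(y_1)\,,$$

$$\begin{aligned}
\frac{\partial F}{\partial x'}(\bar x' - x') + \frac{\partial F}{\partial y'}(\bar y' - y')
&= f\cdot(\bar x' - x') + f'(y_1)\Bigl[-\frac{y'}{x'}(\bar x' - x') + \bar y' - y'\Bigr] \\
&= f\cdot(\bar x' - x') + f'(y_1)(\bar y_1 - y_1)\,\bar x'\,,
\end{aligned}$$

$$F(x, y; \bar x', \bar y') - F(x, y; x', y') = f(x, y, \bar y_1)\,\bar x' - f(x, y, y_1)\,x'\,.$$

Hence, by (73),

$$E(x, y; x', y'; \bar x', \bar y') = \bigl\{f(x, y, \bar y_1) - f(x, y, y_1) - (\bar y_1 - y_1)\,f'(y_1)\bigr\}\,\bar x' \tag{87}$$

and it switches its sign for single-valued functions $f$ when $\bar x'$, $\bar y'$ are replaced by $-\bar x'$, $-\bar y'$, where $\bar y_1 = \bar y'/\bar x'$ remains unaltered.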
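
The sign switch can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the integrand $f(y_1) = y_1^2$ and all numerical values are hypothetical, and $f$ is taken to depend on $y_1$ alone, so that $x$ and $y$ drop out of (87).

```python
def weierstrass_E_first_order(f, df, x_p, y_p, xbar_p, ybar_p):
    """E of (87) for an integrand f depending on y_1 only (x, y held fixed)."""
    y1 = y_p / x_p            # slope y_1 = y'/x' on the extremal
    ybar1 = ybar_p / xbar_p   # slope of the comparison direction
    return (f(ybar1) - f(y1) - (ybar1 - y1) * df(y1)) * xbar_p

f = lambda s: s * s       # hypothetical integrand f(y_1) = y_1^2
df = lambda s: 2 * s      # its derivative f'(y_1)

# Same comparison slope 3/2, traversed in the two opposite directions.
E_forward = weierstrass_E_first_order(f, df, 1.0, 0.5, 2.0, 3.0)
E_backward = weierstrass_E_first_order(f, df, 1.0, 0.5, -2.0, -3.0)
print(E_forward, E_backward)
```

The bracket in (87) is the same in both calls because the slope is unchanged; only the factor $\bar x'$ changes sign.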







Fiir n > 1 dagegen ist nach (82): 



x'y^-y'xW , ( (u) 

y» = — — + r i H „_ 1 > 



X' 






2G = 



<9F 



x ?! y x + ( X M V W\ 

x ,n+l 

y' df , dF x' df , 



da» a; ,n + 1 ay„"‘ T ’ dy (") x' n + 1 dy, 

( x (n) _ (n)\ , d-F / _(n) _ (n)'l 

a*(") v J + a^oo v y y ) 

_ 0/ f - y':r (n) _ a/yt") - y'gW \ ^ 
<9y n | £ m + 1 

df , , 



■a: , 



r /n + 1 



Also wird: 



E{x^\y^-,x^\y^)=E{y n ,y n ) (88) 

= {/ (y„) - / (j/n) - /' (j/n) (Vn - Vn)} x' 

bei Fortlassung der Argumente x^\ y^) (/x = 0, 1, . . .n — 1). 

Hier enthalt der zweite Factor x' keine der willkiirlichen Grossen x^ n \ 
y( n \ so dass der friihere Schluss nicht mehr anwendbar ist, wahrend der erste 
Factor ebenso wie der in (87) sehr wohl ein von x^ n \ y ^ unabhangiges 
Vorzeichen besitzen kann. 

Setzt man namlich y n = z, y n = z, so wird die Function 
E 2 (z,z) = f(z ) - f(z) - (z-z)f'(z) 

an einer Stelle 3 dann und nur dann fiir alle Werte von z positiv sein, wenn 
die durch w = f(z) dargestellte Curve ganz oberhalb der zum Punkte z = z, 
w = w = f(z ) gehorigen Tangente verlauft. 

67 | Dazu aber ist 

1. notwendig, dass sie an der Stelle z, w nach oben gekriimmt ist: 
f"(z) > 0, 

2. hinreichend, dass sie allenthalben nur nach oben gekriimmt ist, 
f"(z) ^ 0, und ihre Tangentenrichtung f(z) nur stetig verandert. 

Fig. 3 Offenbar ist weder die erste Bedingung zugleich hinreichend noch auch 

die zweite zugleich notwendig. Doch geniigt schon die erste Bedingung, wenn 
E 2 (z, z) nur fiir alle z einer gewissen Umgebung von z positiv zu sein braucht. 







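When $n > 1$ one has, on the other hand, by (82),

$$y_n = \frac{x'y^{(n)} - y'x^{(n)}}{x'^{\,n+1}} + \bigl(x^{(\mu)}, y^{(\mu)}\bigr)_{n-1}\,,$$

$$\frac{\partial F}{\partial x^{(n)}} = -\frac{y'}{x'^{\,n+1}}\frac{\partial f}{\partial y_n}\,x'\,, \qquad \frac{\partial F}{\partial y^{(n)}} = \frac{x'}{x'^{\,n+1}}\frac{\partial f}{\partial y_n}\,x'\,,$$

$$\begin{aligned}
\frac{\partial F}{\partial x^{(n)}}\bigl(\bar x^{(n)} - x^{(n)}\bigr) + \frac{\partial F}{\partial y^{(n)}}\bigl(\bar y^{(n)} - y^{(n)}\bigr)
&= \frac{\partial f}{\partial y_n}\Bigl\{\frac{x'\bar y^{(n)} - y'\bar x^{(n)}}{x'^{\,n+1}} - \frac{x'y^{(n)} - y'x^{(n)}}{x'^{\,n+1}}\Bigr\}\,x' \\
&= \frac{\partial f}{\partial y_n}(\bar y_n - y_n)\,x'\,.
\end{aligned}$$

And hence

$$E\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) = E(y_n, \bar y_n) = \bigl\{f(\bar y_n) - f(y_n) - f'(y_n)(\bar y_n - y_n)\bigr\}\,x' \tag{88}$$

when the arguments $x^{(\mu)}$, $y^{(\mu)}$ $(\mu = 0, 1, \dots n-1)$ are omitted.

In this case, the second factor $x'$ contains none of the arbitrary quantities $\bar x^{(n)}$, $\bar y^{(n)}$, so that the previous conclusion is no longer applicable, whereas the sign of the first factor, as well as that of the one in (87), may certainly be independent of $\bar x^{(n)}$, $\bar y^{(n)}$.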

For if we set $y_n = z$, $\bar y_n = \bar z$, the function

$$E_2(z, \bar z) = f(\bar z) - f(z) - (\bar z - z)\,f'(z)$$

is positive at a position $z$ for all values of $\bar z$ if and only if the curve represented by $\bar w = f(\bar z)$ runs entirely above the tangent belonging to the point $\bar z = z$, $\bar w = w = f(z)$.

But for this to be the case, it is

1. necessary that it bends upwards at the position $z$, $w$: $f''(z) \geq 0$,

2. sufficient that it only bends upwards everywhere, $f''(\bar z) \geq 0$, and that the direction of its tangent $f'(\bar z)$ changes only continuously.

Evidently, neither is the first condition also sufficient nor the second also necessary (Fig. 3). But the first condition already suffices if $E_2(z, \bar z)$ only needs to be positive for any $\bar z$ in a certain vicinity of $z$.








Fig. 3. 



Diese Regeln ergeben sich analytisch aus der zu (78) analogen Umfor- 
mung: 



£ — 1 



E 2 {z,z) = [f(z e ) + (1 - e)f'(Ze) (z -z)] 



£ = 0 



1 

O^-^) 2 / (1 -e)f”(z e ) ds 



= 2 &-z?f"{z„) (0<x<l), 



wenn z + e(z — z) = z e gesetzt wird, also: 

E (^x^\y w \x^\y { ^ = E 2 {y n ,y n ) x> 
1 

= x> {Vn ~ Vnf J i l ~ £ )/2 ( X > 2 //i) Z e) de , 



wo: 

68 I und 



/2 (x,y»,y n ) = 



d 2 f{x,y^y n ) 



9y‘n 

z £ =y n +s {y n - y n ) 



(89) 



Der wahre Grund des abweichenden Verhaltens von E in den Fallen n = 1 
und n > 1 liegt, (87) und (88) zufolge, darin, dass im ersten Fall der durch 




Fig. 4. 








Fig. 3.



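These rules analytically arise from the transformation along the lines of (78):

$$\begin{aligned}
E_2(z, \bar z) &= \bigl[f(z_\varepsilon) + (1 - \varepsilon)\,f'(z_\varepsilon)(\bar z - z)\bigr]_{\varepsilon = 0}^{\varepsilon = 1} \\
&= (\bar z - z)^2 \int_0^1 (1 - \varepsilon)\,f''(z_\varepsilon)\,d\varepsilon \\
&= \tfrac{1}{2}(\bar z - z)^2 f''(z_\varkappa) \qquad (0 < \varkappa < 1)\,,
\end{aligned}$$

when we set $z + \varepsilon(\bar z - z) = z_\varepsilon$, and hence

$$\begin{aligned}
E\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) &= E_2(y_n, \bar y_n)\,x' \\
&= x'\,(\bar y_n - y_n)^2 \int_0^1 (1 - \varepsilon)\,f_2(x, y_\mu, z_\varepsilon)\,d\varepsilon\,,
\end{aligned} \tag{89}$$

where

$$f_2(x, y_\mu, y_n) = \frac{\partial^2 f(x, y_\mu, y_n)}{\partial y_n^2}$$

and

$$z_\varepsilon = y_n + \varepsilon(\bar y_n - y_n)\,.$$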
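
The integral form of $E_2$ can be compared with its definition numerically. The following Python sketch is an illustration only: the function $f(z) = z^4$, the sample values $z = 1$, $\bar z = 3$, and the helper `simpson` are assumptions introduced here, not taken from the text.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

# Hypothetical integrand f(z) = z^4 with its first two derivatives.
f = lambda z: z ** 4
df = lambda z: 4 * z ** 3
d2f = lambda z: 12 * z ** 2

def E2_direct(z, z_bar):
    """E_2 as defined: f(z_bar) - f(z) - (z_bar - z) f'(z)."""
    return f(z_bar) - f(z) - (z_bar - z) * df(z)

def E2_integral(z, z_bar):
    """E_2 as (z_bar - z)^2 times the integral of (1 - eps) f''(z_eps) over [0, 1]."""
    z_eps = lambda eps: z + eps * (z_bar - z)
    return (z_bar - z) ** 2 * simpson(lambda eps: (1 - eps) * d2f(z_eps(eps)), 0.0, 1.0)

direct = E2_direct(1.0, 3.0)
via_integral = E2_integral(1.0, 3.0)
print(direct, via_integral)
```

Since $f'' \geq 0$ everywhere for this choice of $f$, both values are positive, in line with the sufficient condition 2.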

According to (87) and (88), the true reason for the deviant behavior of E 
in the cases n = 1 and n > 1 lies in the fact that, in the first case, the 




Fig. 4. 







(62) definierte Punkt 4 auf derselben Curve a von 3 aus nach entgegengesetz- 
ten Richtungen fortschreiten kann, wobei die x', y' und mit ihnen auch E 
entgegengesetzte Vorzeichen annehmen, wahrend fiir n > 1 dies nicht mehr 
gestattet ist, da nach Satz IV die Ableitungen bis zur Ordnung n — 1 ^ 1 als 
stetig vorausgesetzt werden rniissen. 

Die Formeln (87) und (88) kann man auch direct aus der Definition von E 
ableiten, wenn man sich des bekannten Ausdrucks fiir die „erste Variation 11 
mit x als unabhangiger Variable bedient, dessen Entwicklung hier noch kurz 
wiedergegeben werden moge. 

Es ist nach (20): 

F (x M , y (M) ) = / {x, j/ M ) x’ , 

y 

wenn — als Function der x^ v \ {v 5) /x) aufgefasst wird. Betrachtet 
man hier £ = 5x und rj = Sy als unabhangige Variationen, so wird: 

, ,yl- 1 Wn-i y'n-itx' dSx 

d 

fyn - V>x, + iSx = — ( Sy M -i-y fl 5x) = — ( Sy - yxSx) . (90) 

Nun ist: 

69 | $F = 5{f (x,yu)x'} = fSx' + x'^5x + x’ ^ 

= fSx 1 + 5x + x' ^2 - V^ + idx) 

at v=o ay v 

= 1 (/ fa > <%-».&) ■ 
n = 0 

Es ist aber, wenn man fiir den Augenblick x als unabhangige Variable 
ansieht, nach einer zu (42) analogen Umformung: 

A df dP .. ~ . d A r cP " 1 .. 

= * V (Sv - v&) 

11 = 0 fi= 1 

+L (Sy — yiSx) , 








point 4 defined by (62) may move along the same curve $\bar a$ from 3 in opposite directions, whereby $\bar x'$, $\bar y'$, and $E$ with them, assume opposite signs, whereas this is no longer permissible when $n > 1$ since, by Theorem IV, it is to be assumed that the derivatives up to order $n - 1 \geq 1$ are continuous.

We may also derive the formulas (87) and (88) directly from the defini- 
tion of E by using the known expression for the “first variation” with x as 
independent variable, whose expansion will be outlined here briefly. 

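By (20),

$$F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = f(x, y_\mu)\,x'\,,$$

when $y_\mu = \dfrac{d^\mu y}{dx^\mu}$ is considered a function of $x^{(\nu)}$, $y^{(\nu)}$ $(\nu \leq \mu)$. Considering $\xi = \delta x$ and $\eta = \delta y$ as independent variations, we obtain

$$\delta y_\mu = \delta\frac{y'_{\mu-1}}{x'} = \frac{\delta y'_{\mu-1}}{x'} - \frac{y'_{\mu-1}}{x'^2}\,\delta x' = \frac{d\,\delta y_{\mu-1}}{dx} - y_\mu\frac{d\,\delta x}{dx}\,,$$

$$\delta y_\mu - y_{\mu+1}\,\delta x = \frac{d}{dx}\bigl(\delta y_{\mu-1} - y_\mu\,\delta x\bigr) = \frac{d^\mu}{dx^\mu}\bigl(\delta y - y_1\,\delta x\bigr)\,. \tag{90}$$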

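Now we have

$$\begin{aligned}
\delta F = \delta\{f(x, y_\mu)\,x'\} &= f\,\delta x' + x'\frac{\partial f}{\partial x}\,\delta x + x'\sum_{\mu=0}^{n}\frac{\partial f}{\partial y_\mu}\,\delta y_\mu \\
&= f\,\delta x' + \frac{df}{dt}\,\delta x + x'\sum_{\mu=0}^{n}\frac{\partial f}{\partial y_\mu}\bigl(\delta y_\mu - y_{\mu+1}\,\delta x\bigr) \\
&= \frac{d}{dt}(f\,\delta x) + x'\sum_{\mu=0}^{n}\frac{\partial f}{\partial y_\mu}\frac{d^\mu}{dx^\mu}\bigl(\delta y - y_1\,\delta x\bigr)\,.
\end{aligned}$$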



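Considering $x$ as an independent variable for the present, we have, by a transformation along the lines of (42),

$$\sum_{\mu=0}^{n}\frac{\partial f}{\partial y_\mu}\frac{d^\mu}{dx^\mu}\bigl(\delta y - y_1\,\delta x\bigr) = \frac{d}{dx}\sum_{\mu=1}^{n} L_\mu\frac{d^{\mu-1}}{dx^{\mu-1}}\bigl(\delta y - y_1\,\delta x\bigr) + L\,\bigl(\delta y - y_1\,\delta x\bigr)\,,$$

where

$$L_\mu = \sum_{i=0}^{n-\mu}(-1)^i\frac{d^i}{dx^i}\frac{\partial f}{\partial y_{\mu+i}} \qquad (\mu = 0, 1, \dots n)\,, \qquad L = L_0$$

is set. Hence

$$\delta F = \frac{d}{dt}\Bigl\{f\,\delta x + \sum_{\mu=1}^{n} L_\mu\bigl(\delta y_{\mu-1} - y_\mu\,\delta x\bigr)\Bigr\} + L\,\bigl(\delta y - y_1\,\delta x\bigr)\,x'\,.$$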
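
The repeated integration by parts behind the coefficients $L_\mu$ can be tested on polynomials. The Python sketch below is an illustration under assumptions: the coefficient functions `a[mu]` stand in for $\partial f/\partial y_\mu$ and `u` for $\delta y - y_1\,\delta x$, both chosen arbitrarily as polynomials in $x$, and all helper names are hypothetical.

```python
def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def D(p, times=1):
    """Differentiate the coefficient list p the given number of times."""
    p = list(p)
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:]
    return trim(p)

def L(a, mu):
    """L_mu = sum over i of (-1)^i D^i a_{mu+i}, for i = 0 .. n - mu."""
    n = len(a) - 1
    out = []
    for i in range(n - mu + 1):
        out = add(out, scale((-1) ** i, D(a[mu + i], i)))
    return out

def left_side(a, u):
    """sum over mu of a_mu D^mu u."""
    out = []
    for mu, a_mu in enumerate(a):
        out = add(out, mul(a_mu, D(u, mu)))
    return out

def right_side(a, u):
    """D( sum_{mu>=1} L_mu D^(mu-1) u ) + L_0 u."""
    n = len(a) - 1
    inner = []
    for mu in range(1, n + 1):
        inner = add(inner, mul(L(a, mu), D(u, mu - 1)))
    return add(D(inner), mul(L(a, 0), u))

# Arbitrary sample data for n = 2: a_0, a_1, a_2 and u as polynomials in x.
a = [[1, 2], [0, 0, 3], [5, -1, 0, 2]]
u = [1, -1, 4, 2, 7]
print(left_side(a, u) == right_side(a, u))
```

The recursion $L_\mu = a_\mu - \frac{d}{dx}L_{\mu+1}$ with $L_{n+1} = 0$ is what makes the two sides agree term by term.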







1st nun x = (fit), y = ip(t) eine Losung der Differentialgleichung des 
Problems und im Intervall t\ ^ t ^ £3 von singularen Punkten frei, so ver- 
schwindet wegen L = 0 das zweite Glied und es wird: 



5Jq3 = 



n 

f5x+ Y L n (%- 

M= 1 



VixSx) 



- *3 

- to 



Wenn aber x + eSx + (£) 2 , V + sdy + (e )2 die Coordinaten der Curve a e 
oder 0 4 und x, y die von a oder 4 3 darstellen, in der im Anfang des Ab- 
schnittes definierten Bedeutung, so wird: 



fiir t = to'- Sx = 0 , Sy^ =0 (/j, = 0, 1 , . . . n — 1) , 

weil a und a e einander in 0 von n — 1 ter Ordnung beriihren; 



70 | fiir t = t 3 : Sx = -x' , 5y^ = -y ll + l x' (/x = 0, 1 , . . . n - 1) , 

wo x ' , y die Werte dieser Functionen im Punkte 3 von a bezeichnen. Es wird 
daher: 



= £ 



J t = * 3 



Joi — Jl 13 — sSJq3 + (e) 2 

n 

- / {x, y M ) x' - Y L n (vn ~ y m) x'\ 

n=l 

= -ex' [f (x, y^) + L n (x, y (y n - y n ) ] + (e ) 2 , 

y^=y^ (/z = o, i,...n-i) . 



(^2 



wegen 

Da ferner 



so wird schliesslich: 



^43 = / (a:, Vu) x'e + {e) 2 
df 

und L n (x,y tl ) = - — ist, 
oy n 



Jo 43 — Jo3 ~ sE + (s) 2 

Q j 1 

/ (*> y M ) - / ( x , j/ M ) - fy-iVn- yn) I + (a>2 , 



/ (*, 2/. 2/i) - / (®, 2/, 2/1 ) ^ d ’f — (2/1 - 2/1) 



also fiir n = 1 
E = 

l ' “ 

Fiir n > 1 aber wegen 5f / = x 1 : 



E = 



df ^ 



f (x, l/ju) - / (A, 2//x) - Q^-iVn- Vu) 



(87) 



(88) 







Now if $x = \varphi(t)$, $y = \psi(t)$ is a solution of the differential equation of the problem and without singular points in the interval $t_1 \leq t \leq t_3$, then, on account of $L = 0$, the second term vanishes, and

$$\delta J_{03} = \Bigl[f\,\delta x + \sum_{\mu=1}^{n} L_\mu\bigl(\delta y_{\mu-1} - y_\mu\,\delta x\bigr)\Bigr]_{t_0}^{t_3}\,.$$



But if $x + \varepsilon\,\delta x + (\varepsilon)_2$, $y + \varepsilon\,\delta y + (\varepsilon)_2$ represent the coordinates of the curve $a_\varepsilon$ or 0 4, and $x$, $y$ those of $a$ or 4 3, in the sense defined at the beginning of the section, then

when $t = t_0$: $\delta x = 0$, $\delta y_\mu = 0$ $(\mu = 0, 1, \dots n-1)$,

since $a$ and $a_\varepsilon$ have contact of $n-1$th order at 0 with one another;

when $t = t_3$: $\delta x = -\bar x'$, $\delta y_\mu = -\bar y_{\mu+1}\,\bar x'$ $(\mu = 0, 1, \dots n-1)$,

where $\bar x'$, $\bar y_\mu$ denote the values of these functions at the point 3 of $\bar a$. Thus



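$$\begin{aligned}
J_{04} - J_{03} &= \varepsilon\,\delta J_{03} + (\varepsilon)_2 \\
&= \varepsilon\Bigl[-f(x, y_\mu)\,\bar x' - \sum_{\mu=1}^{n} L_\mu\,(\bar y_\mu - y_\mu)\,\bar x'\Bigr]_{t = t_3} + (\varepsilon)_2 \\
&= -\varepsilon\,\bar x'\,\bigl[f(x, y_\mu) + L_n(x, y_\mu)\,(\bar y_n - y_n)\bigr] + (\varepsilon)_2
\end{aligned}$$

on account of $\bar y_\mu = y_\mu$ $(\mu = 0, 1, \dots n-1)$.

Since furthermore $J_{43} = f(x, \bar y_\mu)\,\bar x'\,\varepsilon + (\varepsilon)_2$ and $L_n(x, y_\mu) = \dfrac{\partial f}{\partial y_n}$, we finally have

$$\begin{aligned}
J_{043} - J_{03} &= \varepsilon E + (\varepsilon)_2 \\
&= \Bigl\{f(x, \bar y_\mu) - f(x, y_\mu) - \frac{\partial f}{\partial y_n}(\bar y_n - y_n)\Bigr\}\,\bar x'\,\varepsilon + (\varepsilon)_2\,,
\end{aligned}$$

and hence when $n = 1$

$$E = \Bigl\{f(x, y, \bar y_1) - f(x, y, y_1) - \frac{\partial f(x, y, y_1)}{\partial y_1}(\bar y_1 - y_1)\Bigr\}\,\bar x'\,. \tag{87}$$

But when $n > 1$, on account of $\bar x' = x'$,

$$E = \Bigl\{f(x, \bar y_\mu) - f(x, y_\mu) - \frac{\partial f}{\partial y_n}(\bar y_n - y_n)\Bigr\}\,x'\,. \tag{88}$$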







In derselben Weise kann man auch den zur Darstellung: 
F (ajM.j/M) = ${x,y,otn) \J x' 2 + y' 2 , 

S2 

J = J @{x,y, a M ) ds , 



(26) 



gehorigen Ausdruck fiir I? bilden. 

Hier wird fiir n > 1 ahnlich wie (82): 



d n ~ 2 a 2 _ d n ~ 2 x'y" - y'x” 
ds n ~ 2 ds n ~ 2 /, j ,/2 _|_ yl 2 j I 

T r n( n ) — i/'r( n ) 

{.A 2 • ,/-’) 5 ' 



9F 

d x ( n ) 



y' d<P 



( x ' 2 + y' 2 ) 2 ® an 
dF x' dd> 



\J x' 2 + y' 2 , 



dy( n ) r x /2 da r 



\Jx ' 2 + ; 



(x( n '> - x (n) ) + ^ ( 7/(^0 - y( n )) 

a*(") 1 j dyw [y y ’ 

x'y^-y'x^ x'y^-y'x^ \ dd> 



{x' 2 + y' 2 ) * (a;' 2 + y' 2 ) 

dd> 



— \ J x ‘ 2 _|_ 



= (a n - a n ) - — i/ £ /2 + y' 2 . 

OOi n 

Also, wenn 

F = F (x^\y^) = <d>\Jx' 2 + y' 2 = <P (x, y, a M ) x' 2 + y' 2 
(^(m) =2; (m) ? y(tJ-) — ^ = a M fiir y = 0, 1 , . . . n — l) 



gesetzt wird, so ist nach (73): 



d$ 



E = E(x, y, a ft] a n ) = < £ - - - — (a n - a„) } — . (91) 






ds 



dt 



Hier ist der erste Factor von derselben Form wie der in (88) und (89), 
es gelten also fiir ihn dieselben Betrachtungen bezriglich des Vorzeichens wie 





In the same way, it is possible to form the expression $E$ belonging to the representation

$$F\bigl(x^{(\mu)}, y^{(\mu)}\bigr) = \Phi(x, y, \alpha_\mu)\sqrt{x'^2 + y'^2}\,, \qquad J = \int_{s_1}^{s_2}\Phi(x, y, \alpha_\mu)\,ds\,. \tag{26}$$



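In this case, for $n > 1$, similar to (82),

$$\alpha_n = \frac{d^{n-2}\alpha_2}{ds^{n-2}} = \frac{d^{n-2}}{ds^{n-2}}\,\frac{x'y'' - y'x''}{(x'^2 + y'^2)^{3/2}} = \frac{x'y^{(n)} - y'x^{(n)}}{(x'^2 + y'^2)^{\frac{n+1}{2}}} + \bigl(x^{(\mu)}, y^{(\mu)}\bigr)_{n-1}\,,$$

$$\frac{\partial F}{\partial x^{(n)}} = -\frac{y'}{(x'^2 + y'^2)^{\frac{n+1}{2}}}\frac{\partial\Phi}{\partial\alpha_n}\sqrt{x'^2 + y'^2}\,, \qquad \frac{\partial F}{\partial y^{(n)}} = \frac{x'}{(x'^2 + y'^2)^{\frac{n+1}{2}}}\frac{\partial\Phi}{\partial\alpha_n}\sqrt{x'^2 + y'^2}\,,$$

$$\begin{aligned}
\frac{\partial F}{\partial x^{(n)}}\bigl(\bar x^{(n)} - x^{(n)}\bigr) + \frac{\partial F}{\partial y^{(n)}}\bigl(\bar y^{(n)} - y^{(n)}\bigr)
&= \Bigl\{\frac{x'\bar y^{(n)} - y'\bar x^{(n)}}{(x'^2 + y'^2)^{\frac{n+1}{2}}} - \frac{x'y^{(n)} - y'x^{(n)}}{(x'^2 + y'^2)^{\frac{n+1}{2}}}\Bigr\}\frac{\partial\Phi}{\partial\alpha_n}\sqrt{x'^2 + y'^2} \\
&= (\bar\alpha_n - \alpha_n)\frac{\partial\Phi}{\partial\alpha_n}\sqrt{x'^2 + y'^2}\,.
\end{aligned}$$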



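Hence, if we set

$$\bar F = F\bigl(\bar x^{(\mu)}, \bar y^{(\mu)}\bigr) = \bar\Phi\sqrt{x'^2 + y'^2} = \Phi(x, y, \bar\alpha_\mu)\sqrt{x'^2 + y'^2}$$

$$\bigl(\bar x^{(\mu)} = x^{(\mu)}\,,\ \bar y^{(\mu)} = y^{(\mu)}\,,\ \bar\alpha_\mu = \alpha_\mu \ \text{ for } \mu = 0, 1, \dots n-1\bigr)$$

then, by (73),

$$E = E(x, y, \alpha_\mu; \bar\alpha_n) = \Bigl\{\bar\Phi - \Phi - \frac{\partial\Phi}{\partial\alpha_n}(\bar\alpha_n - \alpha_n)\Bigr\}\frac{ds}{dt}\,. \tag{91}$$

Here the first factor has the same form as that in (88) and (89), and hence it is subject to the same considerations with respect to its sign as $E_2$ and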






fur E 2 und auch die analoge Umformung: 



1 




£■) ds 



(0 < x < 1 ) , 



(92) 



I wo . 0 den Wert von 7 — r- bezeichnet, wenn a n + e (a n — a n ) an Stelle 
da 2 e _ da 2 n 

von a n gesetzt wird. 

Der andere Factor — = sj x ' 2 + y ’ 2 aber ist unter alien Umstanden posi- 



tiv. 



Auch die Formel (91) lasst sich direct aus einer neuen Form der ersten 
Variation ableiten, deren Entwicklung hier vollstandig Platz finden moge, da 
die Darstellung (26) bisher in der Variationsrechnung, soweit mir bekannt ist, 
keine Anwendung gefunden hat, fiir viele, namentlich geometrische Aufgaben 
jedoch manche Vorziige zu besitzen scheint. 

Nach (26) ist: 



SF = 6 




^ c ds 

= <M * + 



0<p d<P n 
~k~Sx + -K~Sy + J2 £ M 5a M 
dx dy ff = 1 



ds 

dt 



(93) 



wenn — — = gesetzt wird. Statt der unabhangigen Variationen 5x und Sy 

oa^ 

fiihren wir die neuen Sv und Sw ein: 



Sv = cos anfe + sin a^ch/ 
Sw = sinai&r — cos a\Sy , 



Sx = cosai(5u + sinaifon 
Sy = sin«i(5u — cosaiSw 



dSx dSv . ddw . . . 

—— = cosai — 1 - sinai- 1 - a 2 (— smaiou + cosaiow) , 

ds ds ds 



dSy . dSv dSw . 

— — = smai— cosai — 1 - a 2 (cosaiou + smaiiio) , 

ds ds ds 



dSs 

ds 



S\Jx' 2 + y' 2 x'Sx' + y'Sy' 



\Jx' 2 



r /2 



y 



= cosai- 



dSx 

ds 



sin ai 



dSy 

ds 



8a i = 5 arctg — = 
x' 



dSs dSv 

-r- = \-a 2 Sw , 

ds ds 

y x'Sy' — y'Sx' dSy . dSx 



x' 2 + y' 2 



= cos ai — sin ai — — 

ds ds 



(94) 



(95) 







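also the analogous transformation:

$$\begin{aligned}
E &= \frac{ds}{dt}\,(\bar\alpha_n - \alpha_n)^2\int_0^1\frac{\partial^2\Phi_\varepsilon}{\partial\alpha_n^2}\,(1 - \varepsilon)\,d\varepsilon \\
&= \frac{1}{2}\frac{ds}{dt}\,(\bar\alpha_n - \alpha_n)^2\,\frac{\partial^2\Phi_\varkappa}{\partial\alpha_n^2} \qquad (0 < \varkappa < 1)\,,
\end{aligned} \tag{92}$$

where $\dfrac{\partial^2\Phi_\varepsilon}{\partial\alpha_n^2}$ denotes the value of $\dfrac{\partial^2\Phi}{\partial\alpha_n^2}$ when we substitute $\alpha_n + \varepsilon(\bar\alpha_n - \alpha_n)$ for $\alpha_n$.

The other factor $\dfrac{ds}{dt} = \sqrt{x'^2 + y'^2}$, however, is always positive.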



The formula (91), too, is capable of direct derivation from a new form of 
the first variation whose development shall be given here completely, since 
the representation (26) has, to the best of my knowledge, never been applied 
before in the calculus of variations, but appears to have many an advantage 
for a great number of problems, and in particular geometric ones. 

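By (26),

$$\delta F = \delta\Bigl(\Phi\frac{ds}{dt}\Bigr) = \Phi\frac{d\,\delta s}{dt} + \Bigl\{\frac{\partial\Phi}{\partial x}\,\delta x + \frac{\partial\Phi}{\partial y}\,\delta y + \sum_{\mu=1}^{n}\Phi_\mu\,\delta\alpha_\mu\Bigr\}\frac{ds}{dt} \tag{93}$$

assuming we set $\dfrac{\partial\Phi}{\partial\alpha_\mu} = \Phi_\mu$. Instead of the independent variations $\delta x$ and $\delta y$ we introduce the new ones $\delta v$ and $\delta w$:

$$\begin{aligned}
\delta v &= \cos\alpha_1\,\delta x + \sin\alpha_1\,\delta y\,, & \delta x &= \cos\alpha_1\,\delta v + \sin\alpha_1\,\delta w\,, \\
\delta w &= \sin\alpha_1\,\delta x - \cos\alpha_1\,\delta y\,, & \delta y &= \sin\alpha_1\,\delta v - \cos\alpha_1\,\delta w\,,
\end{aligned} \tag{94}$$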
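
That the two pairs of formulas in (94) are inverse to one another, and that $\delta v$ is the tangential and $\delta w$ the normal component of the displacement, can be seen from a few lines of Python. This is only a sketch; the function names and sample values are assumptions made for the illustration.

```python
import math

def to_vw(alpha1, dx, dy):
    """Left-hand pair of (94): components along and across the tangent."""
    dv = math.cos(alpha1) * dx + math.sin(alpha1) * dy
    dw = math.sin(alpha1) * dx - math.cos(alpha1) * dy
    return dv, dw

def to_xy(alpha1, dv, dw):
    """Right-hand pair of (94): back to the coordinate variations."""
    dx = math.cos(alpha1) * dv + math.sin(alpha1) * dw
    dy = math.sin(alpha1) * dv - math.cos(alpha1) * dw
    return dx, dy

alpha1 = 0.7                     # sample tangent angle
dx, dy = 0.3, -1.2               # sample variations
round_trip = to_xy(alpha1, *to_vw(alpha1, dx, dy))

# A unit displacement along the tangent has dv = 1 and dw = 0.
tangential = to_vw(alpha1, math.cos(alpha1), math.sin(alpha1))
print(round_trip, tangential)
```

The same coefficient matrix serves for both directions because it represents a reflection, which is its own inverse.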



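$$\frac{d\,\delta x}{ds} = \cos\alpha_1\frac{d\,\delta v}{ds} + \sin\alpha_1\frac{d\,\delta w}{ds} + \alpha_2\,(-\sin\alpha_1\,\delta v + \cos\alpha_1\,\delta w)\,,$$

$$\frac{d\,\delta y}{ds} = \sin\alpha_1\frac{d\,\delta v}{ds} - \cos\alpha_1\frac{d\,\delta w}{ds} + \alpha_2\,(\cos\alpha_1\,\delta v + \sin\alpha_1\,\delta w)\,,$$

$$\frac{d\,\delta s}{ds} = \frac{\delta\sqrt{x'^2 + y'^2}}{\sqrt{x'^2 + y'^2}} = \frac{x'\,\delta x' + y'\,\delta y'}{x'^2 + y'^2} = \cos\alpha_1\frac{d\,\delta x}{ds} + \sin\alpha_1\frac{d\,\delta y}{ds}\,,$$

$$\frac{d\,\delta s}{ds} = \frac{d\,\delta v}{ds} + \alpha_2\,\delta w\,,$$

$$\delta\alpha_1 = \delta\operatorname{arctg}\frac{y'}{x'} = \frac{x'\,\delta y' - y'\,\delta x'}{x'^2 + y'^2} = \cos\alpha_1\frac{d\,\delta y}{ds} - \sin\alpha_1\frac{d\,\delta x}{ds}\,, \tag{95}$$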







r dSw 

oa i = b a 2 av 

as 



dd> d<P 
— = — cosai 
as cte 



5 >. 



+ 1 



Nun geht (93) liber in: 
5 ((fids) f dSv 



= <d> — b a 2 Siv I + — (cosandu + sinaidw) 



+ — — (sin andu — cosaidud + 
ay z — ' 

y p = i 



6(d>ds) d 



ds ds 



Nun ist aber fiir /i ^ 2: 



= — (<P8v) + <Pa 2 + — — sin on — — cos a\ I Sw 



+ ^2 ( 8a M - a^ + i5v) . 



dct^ — i Sdctfi — i dcx^ — iSds 
a>l ds ds ds ds 



- a« — — + a 2 Sw 



dan — i 



Sa M - a M + idu = — (da M _i - a M di>) - a^a 2 Sw . 

Also ist flir eine beliebige Function A M von t oder s: 

A (da M - cc M + idu) = (Sa^ _ i - a M di>) ] 

~~^s~ ^ a ' x ~ 1 ~ a ^ v ) ~ A^a^Sw . 
Werden aber die A M der Reihe nach definiert durch die Gleichungen: 

A^ H i (m = 1j 2 A„ + i=0) , ( 



^ — E( *)* ’ 



so geht die vorhergehende Gleichung durch Hinzufiigung von 

— ^AA I San — au + iSv) auf beiden Seiten iiber in: 
ds 

@n (Sa^ - a M + 1 Sv) = -^ [A^ (. Sa M _ i - a M du) ] 

i + 1 / c r \ dAp . 

H ^ — [oa^ - a^ + idv) — (da 



// — i a^du) A^a^a 2 5w , 







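$$\delta\alpha_1 = -\frac{d\,\delta w}{ds} + \alpha_2\,\delta v\,, \tag{96}$$

$$\frac{d\Phi}{ds} = \frac{\partial\Phi}{\partial x}\cos\alpha_1 + \frac{\partial\Phi}{\partial y}\sin\alpha_1 + \sum_{\mu=1}^{n}\Phi_\mu\,\alpha_{\mu+1}\,.$$

Now (93) is transformed into

$$\begin{aligned}
\frac{\delta(\Phi\,ds)}{ds} = {} & \Phi\Bigl(\frac{d\,\delta v}{ds} + \alpha_2\,\delta w\Bigr) + \frac{\partial\Phi}{\partial x}\,(\cos\alpha_1\,\delta v + \sin\alpha_1\,\delta w) \\
& + \frac{\partial\Phi}{\partial y}\,(\sin\alpha_1\,\delta v - \cos\alpha_1\,\delta w) + \sum_{\mu=1}^{n}\Phi_\mu\,\delta\alpha_\mu\,,
\end{aligned}$$

$$\begin{aligned}
\frac{\delta(\Phi\,ds)}{ds} = {} & \frac{d}{ds}(\Phi\,\delta v) + \Bigl(\Phi\alpha_2 + \frac{\partial\Phi}{\partial x}\sin\alpha_1 - \frac{\partial\Phi}{\partial y}\cos\alpha_1\Bigr)\,\delta w \\
& + \sum_{\mu=1}^{n}\Phi_\mu\,\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr)\,.
\end{aligned} \tag{97}$$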



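But now for $\mu \geq 2$

$$\delta\alpha_\mu = \delta\frac{d\alpha_{\mu-1}}{ds} = \frac{\delta\,d\alpha_{\mu-1}}{ds} - \frac{d\alpha_{\mu-1}}{ds}\frac{\delta\,ds}{ds} = \frac{d\,\delta\alpha_{\mu-1}}{ds} - \alpha_\mu\Bigl(\frac{d\,\delta v}{ds} + \alpha_2\,\delta w\Bigr)\,,$$

$$\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v = \frac{d}{ds}\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr) - \alpha_\mu\alpha_2\,\delta w\,.$$

Thus, for any function $A_\mu$ of $t$ or $s$,

$$A_\mu\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr) = \frac{d}{ds}\bigl[A_\mu\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr)\bigr] - \frac{dA_\mu}{ds}\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr) - A_\mu\alpha_\mu\alpha_2\,\delta w\,.$$

But if the $A_\mu$ are defined in succession by the equations

$$A_\mu + \frac{dA_{\mu+1}}{ds} = \Phi_\mu \quad (\mu = 1, 2, \dots n\,,\ A_{n+1} = 0)\,, \qquad A_\mu = \sum_{i=0}^{n-\mu}(-1)^i\frac{d^i\Phi_{\mu+i}}{ds^i}\,, \tag{98}$$

then, by adding $\dfrac{dA_{\mu+1}}{ds}\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr)$ on both sides of the previous equation, we obtain

$$\begin{aligned}
\Phi_\mu\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr) = {} & \frac{d}{ds}\bigl[A_\mu\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr)\bigr] \\
& + \frac{dA_{\mu+1}}{ds}\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr) - \frac{dA_\mu}{ds}\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr) - A_\mu\alpha_\mu\alpha_2\,\delta w\,,
\end{aligned}$$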






und wenn man auf beiden Seiten liber /jl = 2, 3, . . . n summiert und das died 
hinzufugt: 



<Pi (5ai — ct2Sv) = ( A\ H ) (dai — ct2dv ) , 

V ds J 

n d n 

y. {Safj, - + iSv) = — y ( 5a M _ i - a^5v) 



1 



(99) 



/i — 2 



+ A\ (dai — CX 26 V ) — y^ A^ l a tl a 2 Sw . 

fi — 2 

Hier ist aber nach (96): 

. , c c , . dSw d dA\ 

A 1 (dai — Q2 ov) = — A\ — — = — — (Aidw) H ; — ow , 

ds ds ds 

so dass nun aus (97), (99) und (100) durch Addition folgt: 



(100) 



5(<Pds) _ d_ j _ AiSw + y A^ (Sa^ _ 1 - a^Sv) | (101) 



ds 



^ = 2 



dd> . dd> / a \ 

+ •( — sin «i - — cos ai + a 2 1 <P - 2^ A v a v J 



cte 



<9y 

(MAT 

ds 



p = 2 



dAi 

ds 



Sw 



+ G'Sw 



( 2 ) ( 2 ) 

4.7 = S J <Pds = [(57\']^| + J G'dwds , wo: 

(i) (i) 

<90 <90 d j 4i 

G' = — sincii - — cosai + a 2 A 0 + — — = 0 (102) 

ax ay as 



A 0 =$-yA^ 



OLu 



li — 2 



die Differentialgleichung des Problems ist und Ao, Ai, . . . A ra , die Coefficien- 
ten in: 



n 

SK = AqSv — A\5w + A^Sa^ _ 1 , 

fjb — 2 

die Ausdrlicke sind, welche nach einem zu III analogen Satze immer stet.ig 
bleiben miissen flir alle Curven a , denen ein Minimum entsprechen soil. Die 
Beweise werden analog geflihrt wie im zweiten Abschnitt flir die dort zu 
Grunde gelegte Darstellung. 







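and, by summing over $\mu = 2, 3, \dots n$ on both sides and adding the term

$$\Phi_1\,(\delta\alpha_1 - \alpha_2\,\delta v) = \Bigl(A_1 + \frac{dA_2}{ds}\Bigr)(\delta\alpha_1 - \alpha_2\,\delta v)\,,$$

$$\sum_{\mu=1}^{n}\Phi_\mu\bigl(\delta\alpha_\mu - \alpha_{\mu+1}\,\delta v\bigr) = \frac{d}{ds}\sum_{\mu=2}^{n}A_\mu\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr) + A_1\,(\delta\alpha_1 - \alpha_2\,\delta v) - \sum_{\mu=2}^{n}A_\mu\alpha_\mu\alpha_2\,\delta w\,. \tag{99}$$

But here, by (96),

$$A_1\,(\delta\alpha_1 - \alpha_2\,\delta v) = -A_1\frac{d\,\delta w}{ds} = -\frac{d}{ds}(A_1\,\delta w) + \frac{dA_1}{ds}\,\delta w\,, \tag{100}$$

so that it now follows from (97), (99) and (100) by addition that

$$\begin{aligned}
\frac{\delta(\Phi\,ds)}{ds} = {} & \frac{d}{ds}\Bigl\{\Phi\,\delta v - A_1\,\delta w + \sum_{\mu=2}^{n}A_\mu\bigl(\delta\alpha_{\mu-1} - \alpha_\mu\,\delta v\bigr)\Bigr\} \\
& + \Bigl\{\frac{\partial\Phi}{\partial x}\sin\alpha_1 - \frac{\partial\Phi}{\partial y}\cos\alpha_1 + \alpha_2\Bigl(\Phi - \sum_{\mu=2}^{n}A_\mu\alpha_\mu\Bigr) + \frac{dA_1}{ds}\Bigr\}\,\delta w \\
= {} & \frac{d\,\delta K}{ds} + G'\,\delta w\,,
\end{aligned} \tag{101}$$

$$\delta J = \delta\int_{(1)}^{(2)}\Phi\,ds = \bigl[\delta K\bigr]_{(1)}^{(2)} + \int_{(1)}^{(2)}G'\,\delta w\,ds\,, \quad\text{where}$$

$$G' = \frac{\partial\Phi}{\partial x}\sin\alpha_1 - \frac{\partial\Phi}{\partial y}\cos\alpha_1 + \alpha_2 A_0 + \frac{dA_1}{ds} = 0\,, \qquad A_0 = \Phi - \sum_{\mu=2}^{n}A_\mu\alpha_\mu \tag{102}$$

is the differential equation of the problem, and $A_0, A_1, \dots A_n$, the coefficients in

$$\delta K = A_0\,\delta v - A_1\,\delta w + \sum_{\mu=2}^{n}A_\mu\,\delta\alpha_{\mu-1}\,,$$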

are those expressions that, according to a theorem along the lines of III, 
must always remain continuous for all curves a to which a minimum is to 
correspond. The proofs are carried out in the same way as those for the 
representation underlying the second section. 







Fur den Fall n = 2 z. B., wo a\ = a den Winkel zwischen Tangente und 
x-Achse, a 2 = k die Kriiinrnung bedeutet, wird: 



A 2 — @2 



F = F(x,y,a,k) 



ds 

dt 



dF 
~dk ’ 



A 0 = F - F 2 k , 



A 1 



d<P 

da 



d dF 
ds dk 



also 



dF 



dF 



G' = — - sin a — —— cos 



dx 



dy 



, , , dF d dF 

k *- k di f + SJaS 



d^dF 

ds 2 dk 



SK = 




Sv — 




ddF\ dF 

T s m) Sw+ dk Sa 



Zur Berechnung von E nehmen wir an, dass sich der Punkt 4 unserer Fig. 2 
auf der Curve a urn das Bogenelement a von 3 aus riickwarts verschoben habe, 
und es seien x + aSx + (a) 2 , y + crSy + (a) 2 die Coordinaten von 0 4 oder a e , 
wenn die entsprechenden von a oder 0 3 mit x, y bezeichnet werden. Dann 
ist fiir den Punkt 0: 



5x = 0 , Sy = 0 ; Sv = 0 , Sw = 0 , 

5a\ = 0, . . . 5a n _ 1 = 0 , also SK = 0 , 

76 | fiir den Punkt 4 aber, den wir durch Variation von 3 entstanden denken: 

dx _ d y . _ 

Sx = — — = — cos «i , dy = — — = — sin a\ , 
ds ds 

Sv = — cos (on — ai) , Sw — sin (a-i — ai) , 
und fiir n > 1 wegen (fi < n): 

Sx = — cos ai , Sy = — sin a\ , Sv = — 1 , Sw = 0 , 

Sa ^ — 1 a^ — n ( a n cn n ) (/r — 2,3,... ti) , 

n 

SK — F ^ d? n (cH n CT n ) , 

M — 2 

wenn die gestrichenen Buchstaben die auf a, die ungestrichenen die auf a 
genommenen Werte der Ausdriicke im Punkte 3 bezeichnen. 

Es wird nun nach (101), da a der Gleichung G' = 0 geniigt: 

SJ 03 = [SK}f 0 ] = -F-F n (: a n - a n ) 
und J 043 — J 03 = Jo 4 — Jo 3 + J 43 = crSJo 3 + Fa + (a) 2 
= a {F - F - F n (a n - a n ) } + (a) 2 
ds f— dF 1 

= £ dtr- i ‘-^ ia "- a " ) s +{ch ' 







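For instance, for $n = 2$,

$$F = \Phi(x, y, \alpha, k)\frac{ds}{dt}\,,$$

$$A_2 = \Phi_2 = \frac{\partial\Phi}{\partial k}\,, \qquad A_0 = \Phi - \Phi_2 k = \Phi - \frac{\partial\Phi}{\partial k}\,k\,, \qquad A_1 = \frac{\partial\Phi}{\partial\alpha} - \frac{d}{ds}\frac{\partial\Phi}{\partial k}\,, \quad\text{hence}$$

$$G' = \frac{\partial\Phi}{\partial x}\sin\alpha - \frac{\partial\Phi}{\partial y}\cos\alpha + k\,\Phi - k^2\frac{\partial\Phi}{\partial k} + \frac{d}{ds}\frac{\partial\Phi}{\partial\alpha} - \frac{d^2}{ds^2}\frac{\partial\Phi}{\partial k}\,,$$

$$\delta K = \Bigl(\Phi - \frac{\partial\Phi}{\partial k}\,k\Bigr)\delta v - \Bigl(\frac{\partial\Phi}{\partial\alpha} - \frac{d}{ds}\frac{\partial\Phi}{\partial k}\Bigr)\delta w + \frac{\partial\Phi}{\partial k}\,\delta\alpha\,,$$

where $\alpha_1 = \alpha$ denotes the angle between tangent and $x$ axis, and $\alpha_2 = k$ the curvature.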
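
Two familiar special cases of the expression $G'$ for $n = 2$ may serve as a plausibility check. The Python sketch below is an illustration only and rests on loud assumptions: $\Phi$ is taken to depend on the curvature $k$ alone, so that the terms in $x$, $y$ and $\alpha$ drop out, the curve is described by a hypothetical curvature function $k(s)$, and the second derivative is approximated by a central difference.

```python
def G_prime(phi, dphi_dk, k, s, h=1e-3):
    """G' = k*Phi - k^2*Phi_k - d^2/ds^2 (Phi_k), for Phi depending on k only.

    phi, dphi_dk : Phi(k) and its derivative with respect to k (assumed given)
    k            : curvature as a function of arc length s (assumed given)
    The second derivative along the curve is a central finite difference.
    """
    ks = k(s)
    d2 = (dphi_dk(k(s + h)) - 2 * dphi_dk(ks) + dphi_dk(k(s - h))) / h ** 2
    return ks * phi(ks) - ks ** 2 * dphi_dk(ks) - d2

# Arc length, Phi = 1: G' reduces to k, so G' = 0 gives straight lines.
g_length = G_prime(lambda k: 1.0, lambda k: 0.0, lambda s: 0.5, s=1.0)

# Squared curvature, Phi = k^2: G' = -(2 k'' + k^3); with k(s) = s, k'' = 0.
g_bending = G_prime(lambda k: k * k, lambda k: 2 * k, lambda s: s, s=2.0)
print(g_length, g_bending)
```

For $\Phi = 1$ the equation $G' = 0$ thus states that the curvature vanishes, and for $\Phi = k^2$ it becomes $2k'' + k^3 = 0$.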

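To evaluate $E$ we assume that point 4 in our Fig. 2 has slid backwards on the curve $\bar a$ from 3 by the element of arc $\sigma$. Let $x + \sigma\,\delta x + (\sigma)_2$, $y + \sigma\,\delta y + (\sigma)_2$ be the coordinates of 0 4 or $a_\varepsilon$, where the corresponding ones of $a$ or 0 3 are denoted by $x$, $y$. Then, for the point 0,

$$\delta x = 0\,,\ \delta y = 0\,;\quad \delta v = 0\,,\ \delta w = 0\,,\quad \delta\alpha_1 = 0\,, \dots\ \delta\alpha_{n-1} = 0\,, \quad\text{also}\quad \delta K = 0\,,$$

but for the point 4, which we take to be generated by variation of 3,

$$\delta x = -\frac{d\bar x}{ds} = -\cos\bar\alpha_1\,, \qquad \delta y = -\frac{d\bar y}{ds} = -\sin\bar\alpha_1\,,$$

$$\delta v = -\cos(\bar\alpha_1 - \alpha_1)\,, \qquad \delta w = \sin(\bar\alpha_1 - \alpha_1)\,,$$

and for $n > 1$, on account of $\bar\alpha_\mu = \alpha_\mu$ $(\mu < n)$,

$$\delta x = -\cos\alpha_1\,, \qquad \delta y = -\sin\alpha_1\,, \qquad \delta v = -1\,, \qquad \delta w = 0\,,$$

$$\delta\alpha_{\mu-1} = -\bar\alpha_\mu = -\alpha_\mu - \varepsilon_{\mu n}\,(\bar\alpha_n - \alpha_n) \qquad (\mu = 2, 3, \dots n)\,,$$

$$\delta K = -\Phi - \sum_{\mu=2}^{n}A_\mu\,\varepsilon_{\mu n}\,(\bar\alpha_n - \alpha_n) = -\Phi - \Phi_n\,(\bar\alpha_n - \alpha_n)\,,$$

where the barred letters denote those values of the expressions taken on $\bar a$ at the point 3, the non-barred letters those taken on $a$.

Since $a$ satisfies the equation $G' = 0$, we now have, by (101),

$$\delta J_{03} = \bigl[\delta K\bigr]_{(0)}^{(3)} = -\Phi - \Phi_n\,(\bar\alpha_n - \alpha_n)$$

and

$$\begin{aligned}
J_{043} - J_{03} = J_{04} - J_{03} + J_{43} &= \sigma\,\delta J_{03} + \bar\Phi\,\sigma + (\sigma)_2 \\
&= \sigma\,\bigl\{\bar\Phi - \Phi - \Phi_n\,(\bar\alpha_n - \alpha_n)\bigr\} + (\sigma)_2 \\
&= \varepsilon\,\frac{ds}{dt}\Bigl\{\bar\Phi - \Phi - \frac{\partial\Phi}{\partial\alpha_n}(\bar\alpha_n - \alpha_n)\Bigr\} + (\varepsilon)_2\,,
\end{aligned}$$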







wenn e die Differenz der Werte bezeichnet, welche die unabhangige Variable t 
von a in 4 und in 3 annimmt, so dass: 

ds , . 

” s £ + (£)2 - 

Es ergiebt sich also wieder: 

Vierter Abschnitt. 

Aufstellung hinreichender Bedingungen. 



Es sei 1 2 ein solches Stuck einer Curve a: x = y = ip( A), dass in seiner 

ganzen Ausdehnung Ai ^ A ^ A 2 die Functionen Tp, ip mit ihren Ableitungen 
bis zur mindestens nten Ordnung stetig sind und nirgends ip'( A) und 'tp'(X) 
gleichzeitig verschwinden. 

Nun werde weiter vorausgesetzt, es gebe eine einfach unendliche stetige 
Schar U von particularen Losungen u der Differentialgleichung des Problems: 

X = <fi(t,u) = tp(t-,U 1 ,...U2n) /ggx 

V = 1p(t,u) = i> (t] Ui, . . . U 2n ) , ^ ’ 

u v =u v { A) {v = 1,2, . . . 2n) , 

welche in einem festen Punkte 0 alle mit einander eine Beriilirung n — 1 ter 
Ordnung eingehen und ausserdem die Curve a der Reihe nach in alien ihren 
Punkten 3 zwischen 1 und 2 von n — 1 ter Ordnung beriihren, zwischen 0 
und 3 aber sich ebenfalls bis auf Ableitungen n ter Ordnung stetig verhalten. 




Es giebt dann fiir jeden Wert von A des Intervalls eine solche Function 
A = A (t) = Aft; A) und eine solche Function: t' 3 = t' 3 (A), dass wie in (60) und 







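where $\varepsilon$ denotes the difference of the values assumed by the independent variable $t$ of $\bar a$ in 4 and in 3 so that

$$\sigma = \frac{ds}{dt}\,\varepsilon + (\varepsilon)_2\,.$$

Hence, we again obtain

$$E = \Bigl\{\bar\Phi - \Phi - \frac{\partial\Phi}{\partial\alpha_n}(\bar\alpha_n - \alpha_n)\Bigr\}\frac{ds}{dt}\,. \tag{91}$$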



Fourth section. 

Specification of sufficient conditions. 

Let 1 2 be a segment of a curve $\bar a$: $\bar x = \bar\varphi(\lambda)$, $\bar y = \bar\psi(\lambda)$ such that the functions $\bar\varphi$, $\bar\psi$ and their derivatives up to at least the $n$th order are continuous on its entire extension $\lambda_1 \leq \lambda \leq \lambda_2$, and $\bar\varphi'(\lambda)$ and $\bar\psi'(\lambda)$ nowhere vanish simultaneously.

Furthermore, let there be a simply infinite continuous family $U$ of particular solutions $u$ of the differential equation of the problem

$$\begin{aligned}
x &= \varphi(t, u) = \varphi(t; u_1, \dots u_{2n})\,, \\
y &= \psi(t, u) = \psi(t; u_1, \dots u_{2n})\,,
\end{aligned} \tag{103}$$

$$u_\nu = u_\nu(\lambda) \qquad (\nu = 1, 2, \dots 2n)\,,$$

all of which have contact of $n-1$th order with one another at a fixed point 0 and, moreover, have contact of $n-1$th order with the curve $\bar a$ successively at all its points 3 between 1 and 2, but are also continuous between 0 and 3 except for derivatives of $n$th order.




Then, for any value of $\lambda$ of the interval, there is a function $\bar\lambda = \bar\lambda(t) = \bar\lambda(t; \lambda)$ and a function $t_3' = t_3'(\lambda)$ such that, as in (60) and according to the







in der dort angegebenen Bedeutung fiir t = t' 3 : 

D»m=^ ) (t' 3 )=TpM( t ' 3 ,u) 

(M = 0,l,...n-1) , 

wahrend das zu A — l gehorige particulare Integral u 1 die Curve u in 0 und a 
in i[x = Tp{ X — l), y = if>(\ — t)] von n — Iter Ordnung beriihrt, fiir i = 0 
aber sich auf u reduciert, also alien im vorigen Abschnitte an a e gestellten 
Forderungen geniigt. Daher ist die Formel (72) anwendbar und ergiebt, wenn 
die zusammengesetzten Integrale 

A>32 = J03 + J32 — S(A) , J042 = J04 + J 42 = S(A — t) 

gesetzt werden, 



S(A - t) - S{ A) = J 04 + J 43 - Jos 
= ^E(xM,yM;xW,yW) + (t ) 2 = ^E{ A) + (t ) 2 , 

wo ( t' 3 ,u ) , y M = V> (m) (t3.1t) , 

* (n) = D n Jp{ A) = cp[ n) (t's) , V {n) = D^{ A) = (4) 

anzunehmen ist. Es wird daher: 



S'( A) = lim 

t = 0 



S(A - t) - 5(A) 
—1 



E( A) 
A' 



also durch Integration: 



(104) 



A2 

S(A) - S(A 2 ) = J 032 - J 02 = J E( X)y , 

A 

A2 

/ j \ 

E(X)— . (105) 

Ai 

Sind im besonderen Falle die Curvenstiicke 0 1 und 0 2 Teile einer einzigen 
particularen Losung a oder 0 12, welche von a in den Punkten 1 und 2 von 
n — 1 ter Ordnung beriihrt wird, so ist wegen J 02 = J 01 + J 12 und X' > 0: 

A2 

79 | AJ =J- J = J{a) - J(a) = j E( A)^ > 0 , (105a) 

Ai 

wenn im ganzen Intervall Ai ^ A ^ A 2 bestandig E{ A) ^ 0 und nicht iiberall 
E{ A) = 0 ist. 







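meaning for $t = t_3'$ specified there,

$$D^\mu \bar\varphi(\lambda) = \bar\varphi_1^{(\mu)}(t_3') = \varphi^{(\mu)}(t_3', u)$$

$$D^\mu \bar\psi(\lambda) = \bar\psi_1^{(\mu)}(t_3') = \psi^{(\mu)}(t_3', u)$$

$(\mu = 0, 1, \ldots, n-1)$,

while the particular integral $u'$ belonging to $\lambda - \iota$ has contact of $(n-1)$th
order with the curve $u$ at 0 and with $\bar a$ at 4 $[x = \bar\varphi(\lambda - \iota),\ y = \bar\psi(\lambda - \iota)]$, but is
reduced to $u$ when $\iota = 0$, and hence meets all requirements specified for $a_\varepsilon$
in the previous section. The formula (72) is therefore applicable and yields,
provided we set the composite integrals

$$J_{032} = J_{03} + \bar J_{32} = S(\lambda), \qquad J_{042} = J_{04} + \bar J_{42} = S(\lambda - \iota),$$

$$S(\lambda - \iota) - S(\lambda) = J_{04} + \bar J_{43} - J_{03} = \frac{\iota}{\lambda'}\, E\big(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\big) + (\iota)_2 = \frac{\iota}{\lambda'}\, E(\lambda) + (\iota)_2,$$

where

$$x^{(\mu)} = \varphi^{(\mu)}(t_3', u), \qquad y^{(\mu)} = \psi^{(\mu)}(t_3', u),$$

$$\bar x^{(n)} = D^n \bar\varphi(\lambda) = \bar\varphi_1^{(n)}(t_3'), \qquad \bar y^{(n)} = D^n \bar\psi(\lambda) = \bar\psi_1^{(n)}(t_3')$$

is to be assumed. Therefore

$$S'(\lambda) = \lim_{\iota = 0} \frac{S(\lambda - \iota) - S(\lambda)}{-\iota} = -\frac{E(\lambda)}{\lambda'}$$

and hence by integration

$$S(\lambda) - S(\lambda_2) = J_{032} - J_{02} = \int_{\lambda}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'}, \qquad (104)$$

$$J_{01} + \bar J_{12} - J_{02} = \int_{\lambda_1}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'}. \qquad (105)$$

If, in particular, the curve segments 0 1 and 0 2 are parts of a single
particular solution $a$ or 0 1 2 with which $\bar a$ makes contact of the $(n-1)$th
order at the points 1 and 2, then, on account of $J_{02} = J_{01} + J_{12}$ and $\lambda' > 0$,

$$\Delta J = \bar J - J = J(\bar a) - J(a) = \int_{\lambda_1}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'} > 0, \qquad (105a)$$

provided that in the entire interval $\lambda_1 \le \lambda \le \lambda_2$ we always have $E(\lambda) \ge 0$
and not everywhere $E(\lambda) = 0$.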
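
The step from the derivative of $S$ to the integral formulas (104), (105) and (105a) can be spelled out as follows. This is only a sketch restating the argument of the text; it assumes that $S$ is differentiable on the whole interval with $\lambda' > 0$, and it uses the boundary values $S(\lambda_2) = J_{02}$ and $S(\lambda_1) = J_{01} + \bar J_{12}$, which result from the definition $S(\lambda) = J_{03} + \bar J_{32}$ when the point 3 is moved to 2 or to 1.

```latex
\begin{align*}
S'(\lambda) &= -\frac{E(\lambda)}{\lambda'} , \\
S(\lambda) - S(\lambda_2)
  &= -\int_{\lambda}^{\lambda_2} S'(\lambda)\, d\lambda
   = \int_{\lambda}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'} , \\
S(\lambda_1) - S(\lambda_2)
  &= \bigl(J_{01} + \bar J_{12}\bigr) - J_{02}
   = \int_{\lambda_1}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'} , \\
\text{and, if } J_{02} = J_{01} + J_{12}: \quad
\bar J_{12} - J_{12}
  &= \int_{\lambda_1}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'} .
\end{align*}
```

Since $\lambda' > 0$, the sign of the difference $\bar J_{12} - J_{12}$ is governed by the sign of $E(\lambda)$ alone, which is why the hypothesis $E(\lambda) \ge 0$ (not identically zero) suffices for a strict inequality.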







Die Formel (105) mit der daran gekniipften Folgerung gilt auch dann noch, 

wenn d ^(A) in einzelnen Punkten des Intervalls endliche Spriinge 

erleiden, wie man sich durch Zerlegung von 1 2 in Teil-Intervalle leicht iiber- 
zeugt, vorausgesetzt nur, dass die Ableitungen bis zur n — 1 ten Ordnung 
und mit ihnen die u u tiberall stetig bleiben, d. h. die Losungen der Diffe- 
rentialgleichung stetig in einander iibergehen. Sei namlich z. B. 3 ein solcher 
Unstetigkeitspunkt, so wird, wenn er der einzige ist: 



/ J A 

E(X)-y 

Ai 

A2 

/ d\ 



^3 



also durch Addition: 



A2 

/ d\ 

E( A)— , q. e. d. 



Ganz analog verfahrt man fiir mehrere Unstetigkeitspunkte. a braucht 
daher nur den Stetigkeitsbedingungen (33) und (34) zu geniigen, und wir 
erhalten den Satz: 

Satz VII. Das iiber ein reguldres Stuck 1 2 einer Curve a, die der Dif- 
fer entialgleichung des Problemes geniigt, erstreckte Integral J besitzt einen 
grosseren Wert, als alle iiber solche erlaubten Variationen a erstreckten Inte- 
grate J, fiir welche eine Schar U von Losungen der Differentialgleichung der 
betrachteten Art existiert und fiir welche immer E{ A) ^ 0 und nicht bestdn- 
dig = 0 ist. 

Dieser Satz soil nun benutzt werden, um hinreichende Bedingungen fiir 
das Bestehen eines Minimums lrerzuleiten. 

80 | Zunachst handelt es sich um die Existenz einer solchen Schar U von parti- 

cularen Losungen der Differentialgleichung, wie sie oben vorausgesetzt wurde. 
Da die Curvenstiicke 0 3, 3 4 und 0 4 als Teile der Curven u, d und u' in 
denselben Beziehungen zu einander stehen wie im vorigen Abschnitte als Tei- 
le von a, a und a £ mit dem einzigen Unterschiede, dass jetzt auch a e = u' 
eine Losung der Differentialgleichung sein soil, so moge zunachst untersucht 
werden, unter welchen Bedingungen eine Curve a e mit den in (60) bis (63) 
angegebenen Eigenschaften existiert, welche ausserdem noch gleichfalls der 
Differentialgleichung des Problemes geniigt, also eine zu a „benachbarte Lo- 
sung“ derselben darstellt. 

Nach (59) muss eine solche Curve, wenn sie existiert, von der Form sein: 
x = ip(t, a + u) = ip (t; ai + u\, . . . a 2n + w 2n ) = <£ e (t) 
y = ip(t, a + w) = (f; ai + wi, . . . a 2 „ + u> 2n ) = 



(106) 







The formula (105) together with its associated conclusion even holds when
$\bar\varphi^{(n)}(\lambda)$, $\bar\psi^{(n)}(\lambda)$ suffer finite jump discontinuities at some points of the
interval, as can readily be seen by decomposing 1 2 into partial
intervals, provided only that the derivatives up to the $(n-1)$th order, together
with the $u_\nu$, are continuous everywhere, i.e., the solutions of the differential
equation pass continuously into one another. For if we suppose
that, e. g., 3 is such a point of discontinuity, then, assuming it is the only one,

$$J_{01} + \bar J_{13} - J_{03} = \int_{\lambda_1}^{\lambda_3} E(\lambda)\, \frac{d\lambda}{\lambda'},$$

$$J_{03} + \bar J_{32} - J_{02} = \int_{\lambda_3}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'},$$

and hence, by addition,

$$J_{01} + \bar J_{12} - J_{02} = \int_{\lambda_1}^{\lambda_2} E(\lambda)\, \frac{d\lambda}{\lambda'}, \qquad \text{q. e. d.}$$



The analogous procedure is used for several points of discontinuity. Thus, 
a only needs to satisfy the continuity conditions (33) and (34), and we obtain 
the theorem 

Theorem VII. The integral $J$ taken along a regular segment 1 2 of a
curve $a$ satisfying the differential equation of the problem has a greater value
than all integrals $\bar J$ taken along admissible variations $\bar a$ for which there exists
a family U of solutions of the differential equation of the kind under consideration
and for which always $E(\lambda) \ge 0$ and not always $= 0$.
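
To see what a sign condition of the form $E \ge 0$ looks like in the simplest situation, the following sketch uses the classical Weierstrass excess function of the first-order parametric problem with the arc-length integrand. It is an illustration only: the formula for $E$ below is the classical one for $n = 1$, and it is an assumption, not checked against the earlier sections, that the function $E(\lambda)$ of the text reduces to it (possibly up to normalization) in that case. Here $(p, q)$ stands for the direction of the solution curve and $(\bar p, \bar q)$ for the direction of the comparison curve at the same point; these names are introduced only for this example.

```latex
\begin{align*}
F(x, y, x', y') &= \sqrt{x'^2 + y'^2} , \\
E(x, y; p, q; \bar p, \bar q)
  &= F(x, y, \bar p, \bar q)
     - \bar p\, F_{x'}(x, y, p, q)
     - \bar q\, F_{y'}(x, y, p, q) \\
  &= \sqrt{\bar p^2 + \bar q^2}
     - \frac{\bar p\, p + \bar q\, q}{\sqrt{p^2 + q^2}} \\
  &= \sqrt{\bar p^2 + \bar q^2}\,\bigl(1 - \cos\theta\bigr) \;\ge\; 0 ,
\end{align*}
% theta denotes the angle between the directions (p, q) and (\bar p, \bar q).
```

In this example $E$ vanishes only where the comparison curve has the same direction as the solution curve, so the condition "not always $= 0$" is met by every comparison curve that is not itself a straight segment along the same line.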

We will now use this theorem in order to deduce sufficient conditions for 
the existence of a minimum. 

At first, we are concerned with the existence of such a family U of particular
solutions of the differential equation as was assumed above. Since the
same relations obtain among the curve segments 0 3, 3 4 and 0 4, whether
they are parts of the curves $u$, $\bar a$ and $u'$ or, as in the previous section, parts
of the curves $a$, $\bar a$ and $a_\varepsilon$, with the only exception that now $a_\varepsilon = u'$, too, is
supposed to be a solution of the differential equation, we shall, at first, investigate
the conditions under which there exists a curve $a_\varepsilon$ with the properties
specified in (60) - (63) that, in addition, satisfies the differential equation of
the problem, and hence constitutes a “solution” thereof “neighboring” $a$.

By (59), a curve of this kind, assuming it exists, must have the form

$$x = \varphi(t, a + \omega) = \varphi(t; a_1 + \omega_1, \ldots, a_{2n} + \omega_{2n}) = \varphi_\varepsilon(t)$$

$$y = \psi(t, a + \omega) = \psi(t; a_1 + \omega_1, \ldots, a_{2n} + \omega_{2n}) = \psi_\varepsilon(t) \qquad (106)$$







und fur passend bestimmte Werte der Grossen 



r^M=r, T<">(i 3 ) = T<»>, 

(/Li = 0, 1 , ... n — 1 ; = 1 , 2 ,... 2n) , 






die mit £ gleichzeitig unendlich klein werden iniissen, dem Gleichungs- 
Systeme (65) geniigen, das sich jetzt in der Form schreiben lasst: 



[ip (t + r 0 , a + w) - (p(t)] = 0 

( t = tn ) 

[ip ( t + r 0 , a + oj) - ip(t)\ = 0 
D»[ip(t + r 3 ,a + u) -<p(t)] =^ ) > 

D' i [ip(t + T 3 ,a + oj)-ip(t)]=r]^ ) 

(H = 0,l,...n-1) , 

wo wegen (60) auch die Grossen 

?3 M) = ^i M) (*3 - e) - ¥> (m) (is) = V { i ] (is - e) - ^ (is) , 
’?3 M) = (is - e) - (*3) = ih ) (is - e) ~ ipV (is) 



(107) 



(108) 



mit £ gleichzeitig unendlich klein werden. 

81 | Fiir hinreichend kleine |ro|, |w„| aber gilt die Entwicklung: 



ip{t + T 0 ,a + uj) = ip'(t,a)T 0 + '^2,tp v {t,a)u> v + (t 0 ,u u ) 2 

IS 

= <p'{t)T 0 + Y y v (t)u v + (t 0 ,U v ) 2 . 

IS 

Diese Potenzreihe kann man unter der Voraussetzung, class tq = To(i) 
in der Umgebung von t = to den Charakter einer ganzen Function besitzt, 
gliedweise differentiieren , wobei die Dimension jedes Gliedes in Bezug auf die 
Gesamtheit der Grossen Tq M \ ungeandert bleibt, und die so entstehen- 
den Reihen werden fiir hinreichend kleine Betrage dieser Grossen in einem 
Bereiche \t — to\ < d unbedingt und gleichmassig convergieren und die Ablei- 
tungen der urspriinglichen Reihe darstellen. Behandelt man ebenso auch die 
anderen, durch Vertauschung von tp mit ip oder von to mit t 3 entstehenden 
Reihen, so lassen die Gleichungen (107) sich schreiben: 

[ip'(t)T 0 \ to + Y vf 1 (to) u v + ( ' 2 = 0 

IS 

[lp'(t)T 0 ] to +Y^ ) (to)u v + = 0 

* , , , , , N , (107a) 

Dii w(t)^\t 3 + Y ^ ^ 2 = ^ 

is 

D>i W(t)T 3 \ t3 +Y^ ) (*3)^ + = J ?3 M) 

IS 

(M = 0, l,...n-l) , 







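and, for suitably determined values of the quantities

$$\tau_0^{(\mu)}(t_0) = \tau_0^{(\mu)}, \qquad \tau_3^{(\mu)}(t_3) = \tau_3^{(\mu)}, \qquad \omega_\nu$$

$(\mu = 0, 1, \ldots, n-1;\ \nu = 1, 2, \ldots, 2n)$,

which have to become infinitely small together with $\varepsilon$, must satisfy the system
of equations (65), which can now be written in the form

$$D^\mu\big[\varphi(t + \tau_0, a + \omega) - \varphi(t)\big]_{t = t_0} = 0$$

$$D^\mu\big[\psi(t + \tau_0, a + \omega) - \psi(t)\big]_{t = t_0} = 0$$

$$D^\mu\big[\varphi(t + \tau_3, a + \omega) - \varphi(t)\big]_{t = t_3} = \xi_3^{(\mu)} \qquad (107)$$

$$D^\mu\big[\psi(t + \tau_3, a + \omega) - \psi(t)\big]_{t = t_3} = \eta_3^{(\mu)}$$

$(\mu = 0, 1, \ldots, n-1)$,

where, on account of (60), the quantities

$$\xi_3^{(\mu)} = \bar\varphi_1^{(\mu)}(t_3 - \varepsilon) - \varphi^{(\mu)}(t_3) = \bar\varphi_1^{(\mu)}(t_3 - \varepsilon) - \bar\varphi_1^{(\mu)}(t_3),$$

$$\eta_3^{(\mu)} = \bar\psi_1^{(\mu)}(t_3 - \varepsilon) - \psi^{(\mu)}(t_3) = \bar\psi_1^{(\mu)}(t_3 - \varepsilon) - \bar\psi_1^{(\mu)}(t_3) \qquad (108)$$

also become infinitely small together with $\varepsilon$.

For sufficiently small $|\tau_0|$, $|\omega_\nu|$, however, the following expansion holds:

$$\varphi(t + \tau_0, a + \omega) - \varphi(t, a) = \varphi'(t, a)\,\tau_0 + \sum_\nu \varphi_\nu(t, a)\,\omega_\nu + (\tau_0, \omega_\nu)_2 = \varphi'(t)\,\tau_0 + \sum_\nu \varphi_\nu(t)\,\omega_\nu + (\tau_0, \omega_\nu)_2 .$$

Assuming that $\tau_0 = \tau_0(t)$ has the character of an entire function in the
vicinity of $t = t_0$, we can differentiate this power series term by term, where
the dimension of each term with respect to the totality of the quantities
$\tau_0^{(\mu)}$, $\omega_\nu$ remains unaltered. The series thus obtained are unconditionally and
uniformly convergent for sufficiently small values of these quantities in a
domain $|t - t_0| < \delta$ and represent the derivatives of the original series. If we
treat in the same way the other series obtained by replacing $\varphi$ by $\psi$ or $t_0$ by $t_3$, then we can
write the equations (107) as follows:

$$D^\mu\big[\varphi'(t)\tau_0\big]_{t_0} + \sum_\nu \varphi_\nu^{(\mu)}(t_0)\,\omega_\nu + \big(\tau_0^{(\mu)}, \omega_\nu\big)_2 = 0$$

$$D^\mu\big[\psi'(t)\tau_0\big]_{t_0} + \sum_\nu \psi_\nu^{(\mu)}(t_0)\,\omega_\nu + \big(\tau_0^{(\mu)}, \omega_\nu\big)_2 = 0$$

$$D^\mu\big[\varphi'(t)\tau_3\big]_{t_3} + \sum_\nu \varphi_\nu^{(\mu)}(t_3)\,\omega_\nu + \big(\tau_3^{(\mu)}, \omega_\nu\big)_2 = \xi_3^{(\mu)} \qquad (107a)$$

$$D^\mu\big[\psi'(t)\tau_3\big]_{t_3} + \sum_\nu \psi_\nu^{(\mu)}(t_3)\,\omega_\nu + \big(\tau_3^{(\mu)}, \omega_\nu\big)_2 = \eta_3^{(\mu)}$$

$(\mu = 0, 1, \ldots, n-1)$,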







wo nach der Leibnitzschen Formel 

D»[<p'(t)T 0 \ to = (f\ ^~ i+1) (to)r^ u.s.w. 

i = 0 ' ' 

gesetzt werden kann. Nun lasst sich nach einem bekannten Satze der Func- 
tionen-Theorie die Auflosung dieser 4n Gleichungen nach den 4 n Unbekann- 
ten (/ x = 0,1 .n— 1; v = 1,2,.. .2 n) fiir hinreichend kleine 

Betrage der rj ^ durch Potenzreihen, die nach ganzen positiven Po- 

tenzen dieser Grossen fortschreiten und mit ihnen zugleich unendlich klein 
82 werden, auf eine einzige Weise immer ausfiihren, vorausgesetzt, dass | die 
Determinante, gebildet aus den Coefficienten der Glieder erster Dimension, 
nicht verschwindet , d. h. dass 

0 = 0 (t 0 ,t 3 ) = 0 (t 0 ,t 3 ;a) (109) 

(i)^ (M_i + 1) (*o) 0 pi^ito) Vn+Ato) 

(i)V’ (M_i + 1) (*o) 0 >o 

0 tt)^- i+1 Hh) < 

0 (t 3 ) # } (t 3 ) ^„(t 3 ) 

wo die einzelnen Glieder immer ganze Systeme von je n 2 Elementen vertreten: 

H = 0, 1, 2, . . . n — 1 ; i, v = 1, 2, . . . n . 

Durch eine kleine Umformung des nach Fortlassung der Glieder hoherer 
Dimension von (107a) iibrig bleibenden Systemes lasst sich nun die Deter- 
minante auf eine iibersichtlichere Form und auf den Grad 2 n statt 4n redu- 
cieren. Dieses System namlich lasst sich schreiben, indem man zugleich den 
Buchstaben /r durch i ersetzt: 



b 

II 

o 

"3? 


<p't 0 + ^2 <p v u v 

IS 


0 


to 

o 

II 

b 


Ip' T 0 + ^ i’vUv 

IS 


0 


Q 

ii 


W't 3 + ^2 

IS 


3 


A) = D i 


ip'rz + ^2 'PvUv 

is 


3 



(»)^- i + 1 \t 0 ) 

(pyM-i + i)^) 

( 110 ) 

+ 1 )(t 3 ) 

+ 1) 



Multipliciert man hier beide Seiten von Ai) und ebenso von Bi) mit 
den dahinter stehenden Factoren und summiert alle diese Producte iiber i = 
0, 1, 2 , i-i und verfahrt dann ebenso mit C,) und Di ) und den zugehorigen 







where, by Leibniz’s formula, we can set

$$D^\mu\big[\varphi'(t)\tau_0\big]_{t_0} = \sum_{i=0}^{\mu} \binom{\mu}{i}\, \varphi^{(\mu - i + 1)}(t_0)\, \tau_0^{(i)} \qquad \text{etc.}$$
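
As a check on the index pattern in Leibniz's formula, the case $\mu = 2$ can be written out in full. The expansion below is the ordinary product rule applied twice to $\varphi'(t)\,\tau_0(t)$ and introduces no assumption beyond sufficient differentiability of $\varphi$ and $\tau_0$.

```latex
\begin{align*}
D^2\bigl[\varphi'(t)\,\tau_0(t)\bigr]
  &= \binom{2}{0}\varphi'''(t)\,\tau_0(t)
   + \binom{2}{1}\varphi''(t)\,\tau_0'(t)
   + \binom{2}{2}\varphi'(t)\,\tau_0''(t) \\
  &= \varphi'''(t)\,\tau_0(t) + 2\,\varphi''(t)\,\tau_0'(t) + \varphi'(t)\,\tau_0''(t) .
\end{align*}
```

The highest derivative $\tau_0^{(\mu)}$ always carries the coefficient $\varphi'(t)$, which is the reason why the factors $\varphi'(t_0)$, $\psi'(t_0)$ appear on the diagonal when the system is later reduced.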

Now, by a well-known theorem of the theory of functions, it is always possible
to uniquely solve these $4n$ equations for the $4n$ unknowns $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, $\omega_\nu$
$(\mu = 0, 1, \ldots, n-1;\ \nu = 1, 2, \ldots, 2n)$ for sufficiently small values of the $\xi_3^{(\mu)}$,
$\eta_3^{(\mu)}$ by means of power series that progress according to positive integral
powers of these magnitudes and become infinitely small together with them,
provided that the determinant, formed from the coefficients of the terms of
first dimension, does not vanish, i.e., that

$$\Theta = \Theta(t_0, t_3) = \Theta(t_0, t_3; a) =
\begin{vmatrix}
\binom{\mu}{i}\varphi^{(\mu-i+1)}(t_0) & 0 & \varphi_\nu^{(\mu)}(t_0) & \varphi_{n+\nu}^{(\mu)}(t_0) \\
\binom{\mu}{i}\psi^{(\mu-i+1)}(t_0) & 0 & \psi_\nu^{(\mu)}(t_0) & \psi_{n+\nu}^{(\mu)}(t_0) \\
0 & \binom{\mu}{i}\varphi^{(\mu-i+1)}(t_3) & \varphi_\nu^{(\mu)}(t_3) & \varphi_{n+\nu}^{(\mu)}(t_3) \\
0 & \binom{\mu}{i}\psi^{(\mu-i+1)}(t_3) & \psi_\nu^{(\mu)}(t_3) & \psi_{n+\nu}^{(\mu)}(t_3)
\end{vmatrix} \neq 0, \qquad (109)$$

where the individual terms always represent entire systems of $n^2$ elements
each:

$\mu = 0, 1, 2, \ldots, n-1;\ i, \nu = 1, 2, \ldots, n$.

By a minor transformation of the system that is obtained from (107a)
by omitting the terms of higher dimension it is now possible to bring the
determinant into a clearer form and reduce its order from $4n$ to $2n$. For we
can write this system down by also replacing the letter $\mu$ by $i$:



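$$A_i)\quad 0 = D^i\Big[\varphi'\tau_0 + \sum_\nu \varphi_\nu\omega_\nu\Big]_0 \qquad\Big|\quad \binom{\mu}{i}\psi^{(\mu-i+1)}(t_0)$$

$$B_i)\quad 0 = D^i\Big[\psi'\tau_0 + \sum_\nu \psi_\nu\omega_\nu\Big]_0 \qquad\Big|\quad -\binom{\mu}{i}\varphi^{(\mu-i+1)}(t_0)$$

$$C_i)\quad \xi_3^{(i)} = D^i\Big[\varphi'\tau_3 + \sum_\nu \varphi_\nu\omega_\nu\Big]_3 \qquad\Big|\quad \binom{\mu}{i}\psi^{(\mu-i+1)}(t_3) \qquad (110)$$

$$D_i)\quad \eta_3^{(i)} = D^i\Big[\psi'\tau_3 + \sum_\nu \psi_\nu\omega_\nu\Big]_3 \qquad\Big|\quad -\binom{\mu}{i}\varphi^{(\mu-i+1)}(t_3)$$

Let us now multiply both sides of $A_i)$ and also of $B_i)$ by the factors
after them, and sum all these products over $i = 0, 1, 2, \ldots, \mu$. And let us proceed
likewise with $C_i)$ and $D_i)$ and the corresponding factors by simultaneously
considering the $\xi_3^{(i)}$, $\eta_3^{(i)}$ as the successive derivatives of functions $\xi_3(t)$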







Factoren, indem man zugleich die if) 3 \ was man immer kann, als die 
successiven Ableitungen von Functionen £ 3 (i) und rj 3 (t) fiir t = t 3 auffasst 
und die Beziehungen 

83 | ( IJ '^)^~ i+1) D i ^ = D^(iP'u) u.s.w. 

i = 0 ' ' 

benutzt, so erhalt man die folgenden Gleichungen: 



Ey) 0 = £>" 


Ip' ip' To - ip' Ip' To + ^2 (V’V*' ~ v'tyv) 


1 


Fy) D* [i p'(p 3 


V 

- 7s] 3 


0 


= 


Ip' ip' t 3 - p'ip'T 3 + ^2 {ip'Vv - v'lpv) Uv 

V 


3 



( 111 ) 



oder, durch Einfiihrung der Bezeichnungen: 

= D » [ip'^ 3 - <p'rfo] t3 , w v {t) = ip'{t)ip v {t) 
E v) 0 = ’52wi} l \to)uv 
FJ =Ew ( i r\t 3 )u v . 



<P'(t)lpu(t) , 



(111a) 



Fiir den Fall ip' {to) ^ 0 wird durch dieses Verfahren die Gleichung A^), 
die nur den Factor ip' {to) erhalt, durch Hinzufiigung der vorhergehenden A t ) 
{i < /i) sowie der Bp mit ihren Factoren, ersetzt werden durch Ey), wobei 
die Determinante sich nur mit ip'{to) multipliciert; ist aber p' (to) ^ 0 , was 
nach unseren Voraussetzungen notwendig eintreten miisste fiir ip'{to) — 0 , 
so wird statt dessen ganz analog B y) ersetzt werden durch Ey), wobei die 
Determinante sich mit —ip' {to) multipliciert. Ebenso kann Gy ) oder D y) 
durch Fy) ersetzt werden, je nachdem ip'(t 3 ) oder <p'(t 3 ) von 0 verschieden 
ist, und die Determinante andert sich dabei um den Factor ip'(t 3 ) oder —p(t 3 ). 
Wird dieses Verfahren successive angewandt fiir /./, = n — 1, n — 2, ... 2, 1, 0 , 
wobei immer wegen i ^ p die einmal umgeformten Gleichungen Ay), B y), 
Cy), Dy) niemals zum zweiten Male verwendet werden, so wird im Falle 
tp'{t 0 ) ^ 0, ip'(t 3 ) ^ 0 das ganze System 

Ay), By), Cy), dj 

ersetzt werden durch: Ey) , B y) , Fy) , Dy) 

mit der neuen Determinante: 



= ±ip'(to) n ip'(t 3 ) n 6(t 0 ,t 3 ) 



84 



| und analog in den iibrigen Fallen. Da aber Ey) und Fy ) die Unbekann- 
ten Tq \ t 3 gar nicht mehr enthalten und t^ nur noch in B /i ), t 3 ^ nur 







and $\eta_3(t)$ for $t = t_3$, which is always possible. Then, by use of the relations

$$\sum_{i=0}^{\mu} \binom{\mu}{i}\, \psi^{(\mu-i+1)}\, D^i u = D^\mu(\psi' u) \qquad \text{etc.},$$

we obtain the following equations:

$$E_\mu)\quad 0 = D^\mu\Big[\psi'\varphi'\tau_0 - \varphi'\psi'\tau_0 + \sum_\nu (\psi'\varphi_\nu - \varphi'\psi_\nu)\,\omega_\nu\Big]_0$$

$$F_\mu)\quad D^\mu\big[\psi'\xi_3 - \varphi'\eta_3\big]_3 = D^\mu\Big[\psi'\varphi'\tau_3 - \varphi'\psi'\tau_3 + \sum_\nu (\psi'\varphi_\nu - \varphi'\psi_\nu)\,\omega_\nu\Big]_3 \qquad (111)$$

or, by introduction of the notation

$$w_\nu(t) = \psi'(t)\varphi_\nu(t) - \varphi'(t)\psi_\nu(t),$$

$$E_\mu)\quad 0 = \sum_\nu w_\nu^{(\mu)}(t_0)\,\omega_\nu$$

$$F_\mu)\quad D^\mu\big[\psi'\xi_3 - \varphi'\eta_3\big]_{t_3} = \sum_\nu w_\nu^{(\mu)}(t_3)\,\omega_\nu . \qquad (111a)$$

When $\psi'(t_0) \neq 0$, the equation $A_\mu)$, which only receives the factor $\psi'(t_0)$,
will be replaced by $E_\mu)$ according to this procedure, through addition of the
preceding $A_i)$ $(i < \mu)$ as well as of the $B_i)$ together with their factors, where
the determinant is only multiplied by $\psi'(t_0)$; but if $\varphi'(t_0) \neq 0$, which, by our
assumptions, must be the case when $\psi'(t_0) = 0$, then, instead, $B_\mu)$ is
replaced by $E_\mu)$ along the very same lines, where the determinant is multiplied
by $-\varphi'(t_0)$. Likewise, it is possible to replace $C_\mu)$ or $D_\mu)$ by $F_\mu)$, depending
on whether $\psi'(t_3)$ or $\varphi'(t_3)$ differs from 0. In those cases, the determinant is
altered by the factor $\psi'(t_3)$ or $-\varphi'(t_3)$. If this procedure is applied successively
for $\mu = n-1, n-2, \ldots, 2, 1, 0$, where, on account of $i \le \mu$, the equations $A_\mu)$,
$B_\mu)$, $C_\mu)$, $D_\mu)$ are never used a second time once they have been transformed,
then, for $\psi'(t_0) \neq 0$, $\psi'(t_3) \neq 0$, the entire system

$A_\mu)$, $B_\mu)$, $C_\mu)$, $D_\mu)$

is replaced by $E_\mu)$, $B_\mu)$, $F_\mu)$, $D_\mu)$

with the new determinant

$$= \pm\,\psi'(t_0)^n\, \psi'(t_3)^n\, \Theta(t_0, t_3)$$

and the same holds for the remaining cases. But since $E_\mu)$ and $F_\mu)$ no longer
contain the unknowns $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, and since $\tau_0^{(\mu)}$ and $\tau_3^{(\mu)}$ only still occur in $B_\mu)$







85 



noch in D erscheint, so zerfallt die neue Determinante in ein Product: 



±0rP'(t o ) n rp'(t 3 ) n 

0 o) 



,.(/d 

J n + v 



(to) 



i + 1) (^o) 0 V^Oo) i’n + vito) 

0 0 wi^ (t 3 ) wl?l „(t 3 ) 






1.(0 



= ±l(^)v^- i+1) (*o)l • l(^)v^- i+1) (*3) 



wl^ito) u>i+„(to) 

(* 3 ) 



$°(f 3 ) 



"n+i 

^n+iz 



wo die beiden Determinanten n ten Grades sich wegen ( ,J ) = 0 (/j, < i) auf ihre 
Diagonalglieder rp’(to) n und ip'(t 3 ) n reducieren und sich gegen die Factoren 
links aufheben, so dass schliesslich: 

0(t o ,t 3 ;a) = ± Iwi^ito) wl^\t 3 )\ 



( 112 ) 



(/r = 0, 1, . . . n — 1; v = 1,2,... 2 n) , 



wo zuletzt v den Colonnenindex und jedes hingeschriebene Glied das System 
von n 2ngliedrigen Colonnen bedeutet, und 

w v (t) = ip' (t)ip' (t) rp v {t) . 



Dieselbe Formel (112) ergiebt sich auf ganz analogem Wege, wenn eine der 
Grossen ip'(to) und il>'(t 3 ) verschwindet und daftir sicher p'(to) oder tp'(t 3 ) 
von Null verschieden ist. 

Wir konnen immer annehmen, dass 0(t 3 ,t) nicht fiir alle Werte t er- 
nes endlichen Intervalles und daher, wenn F und somit auch ip, ip, 0 als 
„analytische“ Functionen vorausgesetzt werden, fiir das game particulare In- 
tegral a der Differentialgleichung bestandig verschwindet. Sonst miisste es 
namlich 2 n Constanten uii, . . . W 2 n geben, fiir welche iiberall 

2 n 

I ^ (Jj v w v{t) = 0 

I/ = l 



wird, falls nicht alle Determinanten ?zten Grades des Systemes w[?\t o) (/i = 
0, l,...n — l;i/ = l,2, ... 2 n) verschwinden. Diesen Ausnahmefall kann man 
aber vermeiden, indem man to durch einen beliebig nahen Punkt t' 0 ersetzt, 



„(m) 



(t) 



in einem ganzen Intervall 



oder es miissten alle diese Determinanten 
verschwinden, und daraus folgte dann wieder eine Relation der behaupteten 
Form. Es ware also immer: 



^ (ip 1 (t)(p„{t) - (p'(t)rp v (t)) 

V 

= ^'(t) 22 <p„(t)u} v - 22 4>v(t)uv = o . 

V V 







and $D_\mu)$ respectively, the new determinant decomposes into a product:

$$\pm\,\Theta\, \psi'(t_0)^n\, \psi'(t_3)^n =
\begin{vmatrix}
0 & 0 & w_\nu^{(\mu)}(t_0) & w_{n+\nu}^{(\mu)}(t_0) \\
\binom{\mu}{i}\psi^{(\mu-i+1)}(t_0) & 0 & \psi_\nu^{(\mu)}(t_0) & \psi_{n+\nu}^{(\mu)}(t_0) \\
0 & 0 & w_\nu^{(\mu)}(t_3) & w_{n+\nu}^{(\mu)}(t_3) \\
0 & \binom{\mu}{i}\psi^{(\mu-i+1)}(t_3) & \psi_\nu^{(\mu)}(t_3) & \psi_{n+\nu}^{(\mu)}(t_3)
\end{vmatrix}$$

$$= \pm\,\Big|\tbinom{\mu}{i}\psi^{(\mu-i+1)}(t_0)\Big| \cdot \Big|\tbinom{\mu}{i}\psi^{(\mu-i+1)}(t_3)\Big| \cdot
\begin{vmatrix}
w_\nu^{(\mu)}(t_0) & w_{n+\nu}^{(\mu)}(t_0) \\
w_\nu^{(\mu)}(t_3) & w_{n+\nu}^{(\mu)}(t_3)
\end{vmatrix},$$

where the two determinants of $n$th order are reduced to their diagonal terms
$\psi'(t_0)^n$ and $\psi'(t_3)^n$ on account of $\binom{\mu}{i} = 0$ $(\mu < i)$ and cancel against the
factors on the left side so that finally

$$\Theta(t_0, t_3; a) = \pm\,\big|\, w_\nu^{(\mu)}(t_0) \quad w_\nu^{(\mu)}(t_3) \,\big| \qquad (112)$$

$(\mu = 0, 1, \ldots, n-1;\ \nu = 1, 2, \ldots, 2n)$,

where $\nu$ denotes the column index and every entered term the system of $n$
columns of $2n$ terms, and

$$w_\nu(t) = \psi'(t)\varphi_\nu(t) - \varphi'(t)\psi_\nu(t) .$$

The same formula (112) is obtained along the same lines when one of the
quantities $\psi'(t_0)$ and $\psi'(t_3)$ vanishes and, in its place, $\varphi'(t_0)$ or $\varphi'(t_3)$
is certainly different from zero.

We may always assume that $\Theta(t_0, t)$ does not vanish for all values $t$
of a finite interval and hence, provided $F$ and thus also $\varphi$, $\psi$, $\Theta$ are taken to
be “analytic” functions, for the entire particular integral $a$ of the differential
equation. For otherwise there would have to be $2n$ constants $\omega_1, \ldots, \omega_{2n}$ such
that everywhere

$$\sum_{\nu=1}^{2n} \omega_\nu\, w_\nu(t) = 0,$$

provided that not all determinants of $n$th order of the system $w_\nu^{(\mu)}(t_0)$ $(\mu =
0, 1, \ldots, n-1;\ \nu = 1, 2, \ldots, 2n)$ vanish. But we can avoid this exceptional case
by replacing $t_0$ by an arbitrarily close point $t_0'$, or all these determinants
$\big|w_\nu^{(\mu)}(t)\big|$ would have to vanish in an entire interval, from which, in turn, a
relation of the asserted form would follow. Hence we would always have

$$\sum_\nu \big(\psi'(t)\varphi_\nu(t) - \varphi'(t)\psi_\nu(t)\big)\,\omega_\nu = \psi'(t) \sum_\nu \varphi_\nu(t)\,\omega_\nu - \varphi'(t) \sum_\nu \psi_\nu(t)\,\omega_\nu = 0 .$$






Da aber hier nirgends die ip v {t), ipv{t) unendlich werden oder die <p'{t), 
ip' {t) gleichzeitig verschwinden sollen, so wird die Function: 

= = (f] 

M) nt) T{> 

im ganzen Intervall endlich bleiben, so dass man schreiben kann: 

<p{t, a + ecu) = ip(t ) + £ ^2 + (e) 2 

V 

= p{t) + £T(fi'(t) + (e) 2 = ip(t + st) + (e) 2 , 
ip{t, a + ew) = ip{t) + £Tip'(t) + (e) 2 = ip(t + £t) + (e) 2 . 

Nun ist aber der Ansatz 

x = p{t + £t) , y = ip{t + £T ) 

nur eine andere Darstellung der Curve a: 

x = <p(t) , y = ip{t) ; 

es gabe also eine durch u v = a v + eui v charakterisierte Losung der Diffe- 
rentialgleichung, die sich fiir kleine £ von a selbst nur um Glieder hoherer 
Dimension unterschiede, oder, anders ausgedriickt: in a fielen zwei unendlich 
benachbart.e Losungen der Differentialgleichung zusammen, a ware also eine 
singulare Curve aus der Schar der Integrate und miisste besonders untersucht 
werden. 

86 Schliessen wir diesen Fall aus, so muss die Gleichung | 0{to,t;a) = 0, 
wenn sie iiberhaupt bestehen kann, eine kleinste Wurzel t = t' 0 > to besitzen. 
Dann mogen nach der Bezeichnung des Herrn Prof. Weierstrass die beiden 
durch t = to und t = t r 0 definierten Stellen der Curve zwei „auf a conjugierte 
Punkte u genannt werden. 

Fur ein solches Paar conjugierter Punkte to, to = t' 0 miissen sich wegen 
0 = 0 die linearen Gleichungen (110) befriedigen lassen fiir = 0, r ^ = 0, 
so dass fiir gewisse Werte der Grossen t^, u) v , die nicht samtlich ver- 
schwinden, die Gleichungen (107) die Form annehmen: 

Zte [ip (t + £T 0l a + eu)] to = <p { ^ {to) + (e) 2 

[ip {t + £T 0 ,a + ew)] to = {to) + 0^2 

[ip {t + £T 3 ,a + £w)] t3 = <p ( m) {t 3 ) + (e)2 

D 11 [ip {t + £T 3 ,a + ew)] t3 = ip(ri (i 3 ) + {e) 2 

(M = 0, l,...n-l) 



( 113 ) 







But since, in this case, neither the $\varphi_\nu(t)$, $\psi_\nu(t)$ are ever supposed to
become infinite nor the $\varphi'(t)$, $\psi'(t)$ to vanish simultaneously, the function

$$\frac{\sum_\nu \varphi_\nu(t)\,\omega_\nu}{\varphi'(t)} = \frac{\sum_\nu \psi_\nu(t)\,\omega_\nu}{\psi'(t)} = \tau(t)$$

remains finite in the entire interval, so that we can write

$$\varphi(t, a + \varepsilon\omega) = \varphi(t) + \varepsilon \sum_\nu \varphi_\nu(t)\,\omega_\nu + (\varepsilon)_2 = \varphi(t) + \varepsilon\tau\varphi'(t) + (\varepsilon)_2 = \varphi(t + \varepsilon\tau) + (\varepsilon)_2 ,$$

$$\psi(t, a + \varepsilon\omega) = \psi(t) + \varepsilon\tau\psi'(t) + (\varepsilon)_2 = \psi(t + \varepsilon\tau) + (\varepsilon)_2 .$$

Now the ansatz

$$x = \varphi(t + \varepsilon\tau), \qquad y = \psi(t + \varepsilon\tau)$$

is but a different representation of the curve $a$:

$$x = \varphi(t), \qquad y = \psi(t) ;$$

and hence there would be a solution of the differential equation characterized
by $u_\nu = a_\nu + \varepsilon\omega_\nu$ which, for small $\varepsilon$, differs from $a$ itself only by terms
of higher dimension, or, put differently: two infinitely neighboring solutions
of the differential equation would coincide in $a$. Hence $a$ would be a singular
curve from the family of the integrals and would require a separate
investigation.
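
The mechanism of this argument, namely that a linear relation among the functions $w_\nu(t) = \psi'(t)\varphi_\nu(t) - \varphi'(t)\psi_\nu(t)$ amounts to a mere change of parameter along the curve, can be seen in a small constructed example. The family below is hypothetical and chosen only for illustration: its constant $u_1$ is deliberately redundant (it merely shifts the parameter), so it is not claimed to be the general solution of any particular variational problem. The slope $m$ is assumed to be different from zero.

```latex
\begin{align*}
&\text{Family: } x = \varphi(t; u_1, u_2) = t + u_1, \qquad
  y = \psi(t; u_1, u_2) = u_2\,(t + u_1); \qquad a = (0, m). \\
&\varphi' = 1, \quad \psi' = m, \quad
  \varphi_1 = 1, \quad \varphi_2 = 0, \quad
  \psi_1 = m, \quad \psi_2 = t . \\
&w_1(t) = \psi'\varphi_1 - \varphi'\psi_1 = m - m = 0, \qquad
  w_2(t) = \psi'\varphi_2 - \varphi'\psi_2 = -t . \\
&\text{With } \omega = (1, 0): \quad
  \sum_\nu \omega_\nu w_\nu(t) = 0, \qquad
  \tau = \frac{\sum_\nu \varphi_\nu \omega_\nu}{\varphi'}
       = \frac{\sum_\nu \psi_\nu \omega_\nu}{\psi'} = 1 . \\
&\varphi(t, a + \varepsilon\omega) = t + \varepsilon = \varphi(t + \varepsilon\tau), \qquad
  \psi(t, a + \varepsilon\omega) = m\,(t + \varepsilon) = \psi(t + \varepsilon\tau) .
\end{align*}
```

In this example the displaced curve is exactly the same line $y = m x$ in a shifted parametrization, with no remainder $(\varepsilon)_2$ at all; in general only the agreement up to terms of higher dimension is asserted.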

If we exclude this case, then the equation $\Theta(t_0, t; a) = 0$, provided that it
can hold at all, must have a smallest root $t = t_0' > t_0$. Then, following the
terminology used by Prof. Weierstrass, the two positions of the curve defined
by $t = t_0$ and $t = t_0'$ shall be called two “points conjugate on $a$”.
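
For orientation, the determinant criterion can be evaluated in the simplest case $n = 1$, where (112) is a determinant of order 2 in the functions $w_\nu(t) = \psi'(t)\varphi_\nu(t) - \varphi'(t)\psi_\nu(t)$. The sketch below takes, as an assumption made only for this example, the two-parameter family of solutions of $y'' + y = 0$ written with $x = t$ as parameter (it arises, for instance, from the classical integrand $y'^2 - y^2$); the family is not taken from the text.

```latex
\begin{align*}
&x = \varphi(t; u_1, u_2) = t, \qquad
  y = \psi(t; u_1, u_2) = u_1 \cos t + u_2 \sin t . \\
&\varphi' = 1, \quad \varphi_1 = \varphi_2 = 0, \quad
  \psi_1 = \cos t, \quad \psi_2 = \sin t . \\
&w_1(t) = -\cos t, \qquad w_2(t) = -\sin t . \\
&\Theta(t_0, t) = \pm
  \begin{vmatrix}
    w_1(t_0) & w_2(t_0) \\
    w_1(t)   & w_2(t)
  \end{vmatrix}
  = \pm\bigl(\cos t_0 \sin t - \sin t_0 \cos t\bigr)
  = \pm \sin(t - t_0) .
\end{align*}
```

The smallest root beyond $t_0$ is $t_0' = t_0 + \pi$, so in this example every point has its conjugate point at parameter distance $\pi$, independently of the constants $u_1$, $u_2$. For the family of straight lines $x = t$, $y = u_1 + u_2 t$ the same computation gives $w_1 = -1$, $w_2 = -t$ and $\Theta(t_0, t) = \pm(t - t_0)$, which has no root $t > t_0$, so no conjugate points occur.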

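Given such a pair of conjugate points $t_0$, $t_3 = t_0'$, the linear equations (110)
must be capable of satisfaction on account of $\Theta = 0$ for $\xi_3^{(\mu)} = 0$, $\eta_3^{(\mu)} = 0$ so
that, for certain values of the quantities $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, $\omega_\nu$ not all of which vanish,
the equations (107) take the form

$$D^\mu\big[\varphi(t + \varepsilon\tau_0, a + \varepsilon\omega)\big]_{t_0} = \varphi^{(\mu)}(t_0) + (\varepsilon)_2$$

$$D^\mu\big[\psi(t + \varepsilon\tau_0, a + \varepsilon\omega)\big]_{t_0} = \psi^{(\mu)}(t_0) + (\varepsilon)_2$$

$$D^\mu\big[\varphi(t + \varepsilon\tau_3, a + \varepsilon\omega)\big]_{t_3} = \varphi^{(\mu)}(t_3) + (\varepsilon)_2 \qquad (113)$$

$$D^\mu\big[\psi(t + \varepsilon\tau_3, a + \varepsilon\omega)\big]_{t_3} = \psi^{(\mu)}(t_3) + (\varepsilon)_2$$

$(\mu = 0, 1, \ldots, n-1)$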







fur beliebig kleine e, die unter einer gewissen Grenze liegen. Dann ist nach 
(21) auch fiir alle „Osculations-Invarianten“ $ (x^ , bis zur n— lten 
Ordnung: 

<P ((p^(to + er 0 , a + eui) , (to + £t 0 , a + ew)) 

= +(e ) 2 

. , , , , . (113a) 

$((pW(t 3 +£T 3 ,a + £w),V’ ( ' i) ( i 3 + £t 3 , a + £ 0 j)) 

= $ (v? (p) (G ),V (/i) (G)) + (e ) 2 , 



was man so ausdriicken kann: es giebt eine zu a unendlich benachbarte Lo- 
sung der Differentialgleichung des Problems (u„ = a„+£w„ fiir ein unendlich 
kleines e), welche a in den beiden Punkten t = to und t = t 3 von n — 1 ter Ord- 
nung beriihrt, eine Eigenschaft, die als Definition der „conjugierten Punkte“ 
angesehen werden kann, wenn sie in dem hier entwickelten Sinne verstanden 
wird. Oft giebt es wirklich eine Schar von reellen Losungen u, die sich an a 
beliebig eng anschliessen und diese Curve ausser in 0 immer noch in einem 
zweiten Punkte 3 von n — 1 ter Ordnung beriihren, dann wird die Grenzlage 
87 dieser | Punkte 3, wenn man u mit a zusammenfallen lasst, durch den zu 0 
„conjugierten“ Punkt O' dargestellt. Doch braucht dieses Verhalten nicht fiir 
jedes Paar conjugierter Punkte stattzufinden, worauf aber hier nicht weiter 
eingegangen werden soil. 

Auch die urspriinglichste und gebrauchlichste Definition der „conjugierten 
Punkte 11 durch die Existenz gewisser Integrale einer linearen Differentialglei- 
chung (SG = 0) soil hier ausser Betracht bleiben, da sie fiir die vorliegende 
Untersuchung ohne wesentliche Bedeutung ist. 

Sind nun 0 und 3 nicht zwei auf a conjugierte Punkte, also 



±0(to,t 3 ;a) > C > 0 , (114) 

so kann man positive Grossen g ' so klein angeben, dass fiir 

< 5 ^ (/x = 0, 1, . . . n — 1) (115) 



Au) 


^ f 


„(aO 


?3 







die Gleichungen (107) in der dort angegebenen Weise durch Potenzreihen- 
Entwicklung nach den Unbekannten t- 3 ' > , ui v aufgelost werden konnen, 
wahrend die Betrage dieser Grossen mit den g ' gleichzeitig unendlich klein 
werden. Das so gefundene Werte-System ist aber auch die einzige Losung, fiir 
welche die Betrage der Unbekannten unter eine gewisse Grenze a' fallen, ja 
die einzige, fiir welche nur |w„| < a ^ a’ {v = 1 , 2 , . . . 2 n) sein soil, ohne dass 
die Tof 1 ' 1 von vorn herein irgendwelchen Beschrankungen unterworfen 

wiirden. Denn den Bemerkungen von (63) zufolge konnen allein durch hinrei- 



chende Verkleinerung der g' und \u> v \ oder der g' und a auch die 



r (M) 



r-G) 



immer beliebig klein, also auch < a' gemacht werden, vorausgesetzt, dass a 
sich im betrachteten Intervall nicht selbst durchschneidet. Lasst man jetzt t 3 
das Intervall U = = G durchlaufen, in welchem alle unsere Vorausset- 







for arbitrarily small $\varepsilon$ below a certain limit. Then also for all “osculation
invariants” $\Phi(x^{(\mu)}, y^{(\mu)})$ up to the $(n-1)$th order, by (21),

$$\Phi\big(\varphi^{(\mu)}(t_0 + \varepsilon\tau_0, a + \varepsilon\omega),\ \psi^{(\mu)}(t_0 + \varepsilon\tau_0, a + \varepsilon\omega)\big) = \Phi\big(\varphi^{(\mu)}(t_0), \psi^{(\mu)}(t_0)\big) + (\varepsilon)_2$$

$$\Phi\big(\varphi^{(\mu)}(t_3 + \varepsilon\tau_3, a + \varepsilon\omega),\ \psi^{(\mu)}(t_3 + \varepsilon\tau_3, a + \varepsilon\omega)\big) = \Phi\big(\varphi^{(\mu)}(t_3), \psi^{(\mu)}(t_3)\big) + (\varepsilon)_2 , \qquad (113a)$$

which may be expressed as follows: There is a solution of the differential
equation of the problem ($u_\nu = a_\nu + \varepsilon\omega_\nu$ for an infinitely small $\varepsilon$) infinitely
neighboring $a$ that has contact of $(n-1)$th order with $a$ at the two points $t = t_0$
and $t = t_3$. This property can be regarded as a definition of the “conjugate
points”, provided it is understood in the sense developed here. There often
really is a family of real solutions $u$ that follow $a$ arbitrarily closely and that
still have contact of $(n-1)$th order with this curve at a second point 3 besides 0.
In this case, the limit position of these points 3 is represented by the point 0'
“conjugate” to 0, if we let $u$ coincide with $a$. That this behavior, however,
need not arise for every pair of conjugate points will not be discussed any
further here.

Nor shall we discuss the original and most common definition of “conjugate
points” in terms of the existence of certain integrals of a linear differential
equation ($\delta G = 0$), since it has no real bearing on our present investigation.
Now if 0 and 3 are not two points conjugate on $a$, hence

$$\pm\,\Theta(t_0, t_3; a) > C > 0 , \qquad (114)$$

then we can take positive quantities $g'^{(\mu)}$ sufficiently small so that, for

$$\big|\xi_3^{(\mu)}\big| < g'^{(\mu)}, \qquad \big|\eta_3^{(\mu)}\big| < g'^{(\mu)} \qquad (\mu = 0, 1, \ldots, n-1), \qquad (115)$$

the equations (107) can be solved in the way specified there for the unknowns
$\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, $\omega_\nu$ by power series expansion, where the absolute values of these
quantities become infinitely small together with the $g'$. The value system
thus found is also the unique solution for which the absolute values of the unknowns
are below a certain limit $a'$, and even the unique solution for which we
require only $|\omega_\nu| < a \le a'$ $(\nu = 1, 2, \ldots, 2n)$ without imposing any
restrictions on the $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$ from the outset. For, according to the observations
concerning (63), we can make the $\big|\tau_0^{(\mu)}\big|$, $\big|\tau_3^{(\mu)}\big|$ arbitrarily small, and hence
also $< a'$, merely by making the $g'$ and $|\omega_\nu|$, or the $g'$ and the bound $a$, sufficiently small,
provided that the curve $a$ does not intersect itself in the interval under consideration.
If we now let $t_3$ pass through the interval $t_1 \le t_3 \le t_2$, in which our assumptions
are all valid without exception, then the $C$ belonging to the individual







zungen ausnahmslos giiltig sind, so konnen die zu den einzelnen Punkten ^3 
gehorigen £ durch ihre constante untere Grenze und ebenso die g ' und a 
durch ihre oberen Grenzen im Intervall ersetzt werden. 

Die hier entwickelte Auflosung der Gleichungen (107) ist immer anwend- 
bar, wenn die beliebige Grossen sind, die den Bedingungen (115) 

genii gen. Haben sie aber die Bedeutung (108) 

£3^ = V { 1 ] (*3 - e) - V { 1 ] (is) , ??3 M) = (*3 - e) - (h) , 

| so kann (115) und damit auch (107) fiir ein hinreichend kleines positives £0 
durch s < £q immer befriedigt werden. 

Dann giebt es eine eindeutig bestimmte und an a sich stetig anschliessende 
Schar U von Losungen der Differentialgleichungen G = 0, welche fiir das 
Stiick der Curve a vom Punkte 3 an bis 4 die Coordinaten: 

x = - e) = <^(A - t) , y = ip^t 3 = e) = ip(X - l) 

und die im Anfang cles Abschnittes geforderten Eigenschaften besitzt. 

Dieselben Betrachtungen wie fiir a gelten auch fiir jede andere Losung u 
der Differentialgleichung, x = ip(t, u), y = ip(t,u), welche denselben Bedin- 
gungen wie a geniigt. So lange also 0(t' o , t' 3 , u) ^ 0 bleibt, wenn mit t' 0 und t' 3 
die zu den Endpunkten 0 und 3 gehorigen Werte von t bezeichnet werden, 
und die Curven u fortfahren, sich zwischen 0 und 3 in der angegebenen Weise 
regular zu verhalten, kann die Schar U, und zwar nur auf eine einzige Weise, 
langs der Curve a stetig fortgesetzt werden. Die zu 0 „conjugierten“ Punkte 3 
der Curven u, in denen die eindeutige stetige Fortsetzung der Schar im allge- 
meinen ein Ende findet, und nach dem Friiheren zwei unendlich benachbarte u 
einander von n — 1 ter Ordnung beriihren, konnen als „Verzweigungsstellen“ 
der Schar U aufgefasst werden. 

Von der Curve a werde nun vorausgesetzt, class sie sich im Intervall 
ti 51 t ^ t 2 (einschliesslich der Grenzen) regular verhalte und keinen zum 
Anfangspunkt t = t\ conjugierten Punkt t = t[ ^ t, 2 besitze, so dass die 
Determinante 0(ti,t) immer von Null verschieden ist fiir t\ < t ^ t -2 und nur 
fiir t = t\ verschwindet. Dann kann aber wegen der Stetigkeit der Function 
0(t' , t") immer eine Grosse to < C so nahe an ti und damit auf a ein Punkt 0 
so nahe vor 1 angenommen werden, dass das regulare Verhalten von a sich 
auch auf das grossere Intervall f 0 ^ t ^ t2 erstreckt, gleichzeitig aber 0(to, t ) 
im ganzen Intervall t\ . . A 2 nirgends, auch an keiner der Grenzen mehr, ver- 
schwindet, der absolute Betrag also eine positive untere Grenze C, besitzt: 

|6>(f 0 ,f)| > C > 0 (ti^f^t 2 ). (116) 

Mithin konnen nach den friiheren Betrachtungen positive Con- 1 stanten a, 
g ' so klein angenommen werden, dass die Gleichungen (107) unter den Be- 
dingungen (115) fiir ti ^ t3 ^ t 2 immer und nur auf eine einzige Weise durch 




Investigations in the calculus of variations 



169 



points £3 can be replaced by their constant lower limit , and likewise the g ' 
and a by their upper limits in the interval. 

The solution of the equations (107) developed here is applicable whenever
the $\xi_3^{(\mu)}$, $\eta_3^{(\mu)}$ are arbitrary quantities satisfying the conditions (115). But
when they have the meaning (108)

(*3 - e) - {h) , (h - e) - ^ {h) , 

then (115), and hence also (107), can always be satisfied by $\varepsilon < \varepsilon_0$ for a
sufficiently small positive $\varepsilon_0$.

In this case, there is a uniquely determined family U, continuously following a, of solutions of the differential equations G = 0 which for the segment
of the curve a from point 3 up to point 4 has the coordinates

x = - e) = ip ( A - i) , y = ^i(*3 = e) = V>(A - t) 

and which possesses the properties called for at the beginning of this section. 

The considerations valid for a are also valid for any other solution u
of the differential equation, $x = \varphi(t, u)$, $y = \psi(t, u)$, that meets the same
requirements as a. Hence, as long as $\Theta(t_0', t_3'; u) \neq 0$, where $t_0'$ and $t_3'$
denote the values of t belonging to the endpoints 0 and 3, and as long as
the curves u continue to be regular between 0 and 3 in the specified way,
the family U can be continuously continued along the curve a, and in one way
only. The points 3 of the curves u which are “conjugate” to 0, at which
the unique continuous continuation of the family usually ends and at which,
according to what was said above, two infinitely neighboring u have contact of
order $n-1$ with one another, may be considered “branching positions”
of the family U.

Now suppose that the curve a is regular in the interval $t_1 \le t \le t_2$
(including the boundaries) and possesses no point $t = t_1' \le t_2$ conjugate to
the starting point $t = t_1$, so that the determinant $\Theta(t_1, t)$ always differs from
zero for $t_1 < t \le t_2$ and only vanishes when $t = t_1$. But then, on account of
the continuity of the function $\Theta(t', t'')$, we can always take a quantity $t_0 < t_1$
close enough to $t_1$, and hence a point 0 close enough in front of 1 on a, so
that the regularity of a also extends to the greater interval $t_0 \le t \le t_2$, while,
at the same time, $\Theta(t_0, t)$ does not vanish anywhere in the entire interval
$t_1 \dots t_2$, including even its boundaries, and the absolute value thus possesses
a positive lower limit $\zeta$:

$$|\Theta(t_0, t)| > \zeta > 0 \quad (t_1 \le t \le t_2). \tag{116}$$
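The shift of the starting point behind (116) can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: it uses the classical integrand $y'^2 - y^2$ (not taken from the text), for which the Jacobi equation $u'' + u = 0$ has the solutions $\cos t$, $\sin t$ and the corresponding $2 \times 2$ determinant is $\sin(t'' - t')$. The names `theta` and `lower_bound` are hypothetical helpers, and the grid minimum is only a sampled stand-in for the lower limit $\zeta$.

```python
import math

def theta(t_prime: float, t_second: float) -> float:
    """Toy determinant for the assumed integrand y'^2 - y^2.

    Built from the solutions cos t and sin t of u'' + u = 0; it equals
    sin(t_second - t_prime). This is not Zermelo's general Theta.
    """
    return (math.cos(t_prime) * math.sin(t_second)
            - math.sin(t_prime) * math.cos(t_second))

def lower_bound(t0: float, t1: float, t2: float, samples: int = 2001) -> float:
    """Smallest sampled value of |theta(t0, t)| for t in [t1, t2]."""
    step = (t2 - t1) / (samples - 1)
    return min(abs(theta(t0, t1 + i * step)) for i in range(samples))

t1, t2 = 0.0, 2.5   # the point conjugate to t1 = 0 lies at pi > 2.5
t0 = t1 - 0.3       # point 0 slightly in front of point 1; t0 + pi still exceeds t2

zeta_at_t1 = lower_bound(t1, t1, t2)   # zero, because theta(t1, t1) = 0
zeta_at_t0 = lower_bound(t0, t1, t2)   # strictly positive on the closed interval
print(zeta_at_t1, zeta_at_t0)
```

With the starting point at $t_1$ the determinant vanishes at the left boundary, so no positive lower limit exists; after the small shift to $t_0$ the sampled minimum stays clearly above zero on the whole closed interval, which is the situation (116) describes.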

Therefore, according to the previous considerations, we can take positive
constants $\alpha$, $g'$ sufficiently small so that, under the conditions (115), the
equations (107) are always and only in one single way satisfied by values of




170 Zermelo 1894 



Werte der Unbekannten Tq^, t^\ mit der Nebenbedingung |w„| < a be- 
friedigt werden, dass ferner die Curven x = ip(t, a + u>), y = ip(t, a + w) fiir 
to + To ^ t ^ to + To sich ebenso wie a regular verhalten und dass endlich 
immer 



±0{t o +T 0 ,t 3 +T 3 ',a + oj) > 0 (117) 

wird. Diese Gleichungen (107) konnen fiir: 

A) - (to ) , = D^( A) - ^ (to ) , 

also nach (115) fiir: 

\D^(X)-^\to)\<g^g^ 

, _ / N , (HO) 

\D^(X)-^(to)\<g^g', 

(A* = 0, 1, . . . n — 1) [vergl. (37)] 



in der Form geschrieben werden: 



[D»ip(t + r 0 , a + w)] to =(p^(t 0 ) 
[D^ip(t + t 0 , a + w)] to = (to) 
[D ll ip(t + To, a + w)] ts = D^( A) 
+ r 3 , a + w)] ts = £>^(A) 
(M = 0, l,...n-l) , 



(119) 



wo alle Bezeichnungen ihre friiheren Bedeutungen haben. Dann driicken sie 
aus, dass das particulare Integral u: 

x = ip(t, a + ui) , y = ip(t , a + u>) , 

einerseits mit a im Punkte 0 (x = ip(t o), y = i/j(to)), andrerseits aber im Punk- 
te 3: 



x = <p(to + To, a + uj) , y = ip(t 3 + r 3 , a + u) , 

mit einer zweiten Curve a (x = <p(A). y = if>( A)) eine Beriihrung von n — 1 ter 
Ordnung eingeht. Wenn also a im ganzen Intervall Ai ^ A ^ A 2 iiberall 
90 fiir passende Werte der to zwischen t\ und G | den Bedingungen (118) in 
der Bedeutung von (37) geniigt, so bilden die so bestimmten Curven u eine 
Schar U mit den samtlichen im Anfang des Abschnittes gefordert.en Eigen- 
schaften. Denn sie miissen sich auch stetig an einanderschliessen, weil die 
Auflosung der Gleichungen (119) unter der Bedingung \uj v \ < a eine eindeu- 
tige ist und daher die „stetige Fortsetzung 11 der Schar, die wegen (117) immer 
Fig. 6 moglich ist, niemals zu anderen Ergebnissen fiihren kann. 







the unknowns $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, $\omega_\nu$ under the ancillary condition $|\omega_\nu| < \alpha$, for $t_1 \le t_3 \le t_2$, and so that, furthermore, the curves $x = \varphi(t, a + \omega)$, $y = \psi(t, a + \omega)$
are regular for $t_0 + \tau_0 \le t \le t_3 + \tau_3$ just like a, and so that, finally, always

$$\pm\Theta(t_0 + \tau_0,\ t_3 + \tau_3;\ a + \omega) > 0. \tag{117}$$



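Thus, when

$$\xi_3^{(\mu)} = D^\mu \bar\varphi(\lambda) - \varphi^{(\mu)}(t_3), \qquad \eta_3^{(\mu)} = D^\mu \bar\psi(\lambda) - \psi^{(\mu)}(t_3),$$

and hence, by (115), when

$$\begin{aligned}
|D^\mu \bar\varphi(\lambda) - \varphi^{(\mu)}(t_3)| &< g_\mu \le g', \\
|D^\mu \bar\psi(\lambda) - \psi^{(\mu)}(t_3)| &< g_\mu \le g',
\end{aligned} \tag{118}$$

($\mu = 0, 1, \dots, n-1$) [comp. (37)]

these equations (107) can be written in the form

$$\begin{aligned}
\left[D^\mu \varphi(t + \tau_0, a + \omega)\right]_{t_0} &= \varphi^{(\mu)}(t_0) \\
\left[D^\mu \psi(t + \tau_0, a + \omega)\right]_{t_0} &= \psi^{(\mu)}(t_0) \\
\left[D^\mu \varphi(t + \tau_3, a + \omega)\right]_{t_3} &= D^\mu \bar\varphi(\lambda) \\
\left[D^\mu \psi(t + \tau_3, a + \omega)\right]_{t_3} &= D^\mu \bar\psi(\lambda)
\end{aligned} \tag{119}$$

($\mu = 0, 1, \dots, n-1$),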



where all denotations have their previous meaning. They then express the
fact that the particular integral u:

$$x = \varphi(t, a + \omega), \qquad y = \psi(t, a + \omega),$$

on the one hand, has contact of order $n-1$ with a at the point 0
($x = \varphi(t_0)$, $y = \psi(t_0)$) but, on the other hand, with a second curve ā
($x = \bar\varphi(\lambda)$, $y = \bar\psi(\lambda)$) at the point 3:

$$x = \varphi(t_3 + \tau_3, a + \omega), \qquad y = \psi(t_3 + \tau_3, a + \omega).$$

Hence, if ā satisfies the conditions (118) in the sense of (37) everywhere in the
entire interval $\lambda_1 \le \lambda \le \lambda_2$ for suitable values of the $t_3$ between $t_1$ and $t_2$, then
the curves u so determined form a family U possessing all properties called
for at the beginning of the section. For they also must continuously follow
one another, since the solution of the equations (119) is unique under the
condition $|\omega_\nu| < \alpha$, and hence the “continuous continuation” of the family,
which, on account of (117), is always possible, can never lead to different
results.

Fig. 6.
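For the lowest case $n = 1$, contact of order $n - 1 = 0$ means no more than a common point, so the conditions (119) say that u starts at point 0 and meets the second curve at point 3. The sketch below illustrates this under assumptions that are not in the text: the toy family $y = c \sin t$ (the extremals of the integrand $y'^2 - y^2$ through the origin), a hypothetical neighboring curve `second_curve`, and the helper name `family_parameter`. The determinant for this family reduces to $\sin t_3$, so the parameter is unique exactly as long as it does not vanish.

```python
import math

def family_parameter(t3: float, y3: float) -> float:
    """Return c such that y = c*sin(t) passes through (0, 0) and (t3, y3).

    sin(t3) plays the role of the determinant for this toy family: the
    parameter is uniquely determined only while it differs from zero.
    """
    theta = math.sin(t3)
    if abs(theta) < 1e-12:
        raise ValueError("t3 is conjugate to t0 = 0: no unique family member")
    return y3 / theta

def second_curve(lam: float) -> float:
    """Hypothetical neighboring curve, standing in for the second curve."""
    return 0.2 * math.sin(lam) + 0.05 * lam * lam

# One family member for each point of the second curve; the parameters vary
# continuously with lam as long as the conjugate point t = pi is avoided.
params = [family_parameter(lam, second_curve(lam)) for lam in (1.0, 1.5, 2.0, 2.5)]
print(params)
```

At $t_3 = \pi$ every member of the toy family returns to the axis, so a target point there either admits no member or all of them; this is the simplest instance of the continuation of the family coming to an end at a conjugate point.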








Wenn ferner die Curven a und a in den Endpunkten 1 und 2 einander von 
n — l ter Ordnung beriiliren, so miissen auf Grand derselben Eindeutigkeit 
die zu diesen Endpunkten gehorigen Curven u mit a selbst zusammenfallen, 
sodass nunmehr der Satz VII angewandt werden kann. 

1st namlich gleichzeitig fiir ein constantes positives g bestandig 

Ei (a M\ y M. y(">) > g > 0 (120) 

(a>) = <p M {t),yM = ip^(t),ti ^t^t 2 ) 

und fiir willkiirliche Werte der Variablen , y^ n \ wo E\ die Bedeutung (78) 
besitzt, so konnen in (118) oder in (37) die Constanten g M so klein gewahlt 
werden, dass auch iminer 

Ei(X) = y (n) ) > 0 

und somit auch E( A) = E (x ( ai) , y ^ ] ; x {n) , ^ 0 

ist fiir: 



x (At) = D^^(t + r 3 , a + u)t 3 , y (ai) = + r 3 , a + w) <3 

in der Bedeutung (119) und fiir beliebige x^ n \ y^ n \ so dass nach VII dem 
Stiick 1 2 von a ein wirkliches Minimum des Integrates in einer „Nachbar- 
schaft. n — 1 ter Ordnung “ ent.spricht. 

91 | Ist aber statt dessen nur bestandig 



Fi (x^\ y^) > g > 0 (121) 

(a>) = ^\t), y ( E> = ipW(t), ti ^ t ^ t 2 ) , 



Ei in der Bedeutung (19) genommen, so konnen durch Hinzufiigung der neuen 
Bedingungen 



D n 7p(\) - < g n 

D n ^{X)-^ n \t 3 )\<g n 



auch die Grossen 

x (n) _ x ( n ) 

\y {n) ~y (n) 



I D n (p{\) - D n ip(t + T 3 ,a + u )) t3 1 < g' n , 
| D n ip(\) - D n ip(t + r 3 , a + w) ts | < ^ 








If, furthermore, the curves a and ā have contact of order $n-1$ at the
endpoints 1 and 2, then, on account of that very uniqueness, the curves u
belonging to these endpoints must coincide with a itself so that it is now
possible to apply Theorem VII.

For if, at the same time, we always have, for some constant positive g,

$$E_1\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) > g > 0 \tag{120}$$

$$\bigl(x^{(\mu)} = \varphi^{(\mu)}(t),\ y^{(\mu)} = \psi^{(\mu)}(t),\ t_1 \le t \le t_2\bigr)$$

and for arbitrary values of the variables $\bar x^{(n)}$, $\bar y^{(n)}$, where $E_1$ has the sense
of (78), then, in (118) or in (37), we may choose the constants $g_\mu$ so small
that always also

$$E_1(\lambda) = E_1\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) > 0$$

and hence also

$$E(\lambda) = E\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) \ge 0$$

for

$$x^{(\mu)} = D^\mu \varphi(t + \tau_3, a + \omega)_{t_3}, \qquad y^{(\mu)} = D^\mu \psi(t + \tau_3, a + \omega)_{t_3}$$

in the sense of (119) and for arbitrary $\bar x^{(n)}$, $\bar y^{(n)}$, so that, by VII, there
corresponds to the segment 1 2 of a a real minimum of the integral in a
“vicinity of order $n-1$”.

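But if, instead, always only

$$F_1\bigl(x^{(\mu)}, y^{(\mu)}\bigr) > g > 0 \tag{121}$$

$$\bigl(x^{(\mu)} = \varphi^{(\mu)}(t),\ y^{(\mu)} = \psi^{(\mu)}(t),\ t_1 \le t \le t_2\bigr),$$

where $F_1$ is taken in the sense of (19), then, by adding the new conditions

$$|D^n \bar\varphi(\lambda) - \varphi^{(n)}(t_3)| < g_n, \qquad |D^n \bar\psi(\lambda) - \psi^{(n)}(t_3)| < g_n,$$

we can also make the quantities

$$\begin{aligned}
|\bar x^{(n)} - x^{(n)}| &= |D^n \bar\varphi(\lambda) - D^n \varphi(t + \tau_3, a + \omega)_{t_3}| < g_n', \\
|\bar y^{(n)} - y^{(n)}| &= |D^n \bar\psi(\lambda) - D^n \psi(t + \tau_3, a + \omega)_{t_3}| < g_n'
\end{aligned}$$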







so klein gemacht werden, dass nach (78) wieder 

£i x {n \ y (n) ) > 0 wird fiir 

xW = D»ip(t + r 3 , a + oj ) t3 , y (/i) = D>*ip(t + r 3 , a + u) H 
{n = 0, 1 , ... n) und x {n) = D n p{ A) , y {n) = D>( A) , 

dass also auch E( A) ^ 0 wird und somit a wieder ein Minimum des Integrates 
liefert, jetzt aber in einer „Nachbarschaft n ter Ordnung“. 

Die aus (120) und (121) gezogenen Schlussfolgerungen bleiben giiltig, 
wenn hier die positive Grosse g durch 0 ersetzt wird, d. h. auch wenn E\ 
oder F\ in einzelnen Punkten von a verschwindet, ohne jedoch, als Function 
von x^\ y^ betrachtet, an einer dieser Stellen sein Vorzeichen zu wechseln, 
also auch fiir alle Werte-Combinationen der Umgebung immer positiv bleibt. 

Dagegen ist noch zu untersuchen, ob nicht der andere Factor (k—k) 2 von E 
in (78) und damit auch E selbst auf a bestdndig verschwinden konne; dann 
wiirde namlich J(a) = J{a) werden, weshalb auch dieser Fall im Satze VII 
ausdriicklich ausgeschlossen wurde. Es wird sich aber zeigen, dass er unter 
den bereits gemachten Voraussetzungen nur fiir a selbst eintritt. 

Ist namlich in (78) 

k-k = y'(x {n) - x (n) ) - x'(y {n) - V {n) ) = 0 , ( 122 ) 

92 | so ist wegen x' 2 + y' 2 > 0 

X (") — a;(n) y( n ) — y( n ) 

7 = T = h 

x' y' 

eine endliche Grosse, und daher ausser (119) noch: 

*(") = D n ip{t + T 3 , a + oj)t 3 = x ^ — hx' = D n Jp{\) , 

2 / ( ") = D n <f(t + r 3 , a + cu) t3 = yW - hy' = D n ff( A) , 

wenn gemass den Bemerkungen von S. 56 die in (119) gar nicht vorkommende 
und in Bezug auf den Wert von E willkiirliche Grosse A ^ durch A — h\' 
ersetzt wird. Das heisst aber: Die beiden Curven a und u gehen im Punkte 3 
eine Beriihrung von n ter statt. nur von n — 1 ter Ordnung mit einander ein. 

Dann kann die Curve a in der Umgebung des Beriihrungspunktes 3 
(t = t 3 ) in der Form ausgedriickt werden: 

x = Tpi(t), y = ipi(t) , so dass 

Pi M) (*3) = , ip^it's) = if^\t' 3 ,u) (123) 

(M = 0, l,...n) . 

Da aber nach (117) auf a auch immer \0{t' o , t ' :i ; u) | > 0 angenommen wer- 
den kann, wo der Wert t' 0 zum Punkte 0 gehort, so sind auch die Formeln (107) 







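so small that, by (78), again

$$E_1\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) > 0$$

for

$$x^{(\mu)} = D^\mu \varphi(t + \tau_3, a + \omega)_{t_3}, \qquad y^{(\mu)} = D^\mu \psi(t + \tau_3, a + \omega)_{t_3}$$

($\mu = 0, 1, \dots, n$) and $\bar x^{(n)} = D^n \bar\varphi(\lambda)$, $\bar y^{(n)} = D^n \bar\psi(\lambda)$,

and thus also $E(\lambda) \ge 0$, and hence that a furnishes again a minimum of the
integral, but now in a “vicinity of $n$th order”.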

The consequences deduced from (120) and (121) retain their validity if
the positive quantity g is replaced here by 0, i.e., even if $E_1$, or $F_1$, vanishes
in individual points of a, without, however, when considered as a function
of $x^{(\mu)}$, $y^{(\mu)}$, switching its sign at one of these positions, and hence always
remains positive for all combinations of values in the vicinity.

By contrast, we still need to investigate whether it is not possible that the
other factor $(\bar k - k)^2$ of E in (78), and hence also E itself on ā, always vanishes;
for, in this case, we would have $J(\bar a) = J(a)$, which is the very reason why
this case was expressly excluded in Theorem VII. But, as we shall see, this
case only arises for a itself under the assumptions already made.

For if in (78)

$$\bar k - k = y'\bigl(\bar x^{(n)} - x^{(n)}\bigr) - x'\bigl(\bar y^{(n)} - y^{(n)}\bigr) = 0, \tag{122}$$

then, on account of $x'^2 + y'^2 > 0$,

$$\frac{\bar x^{(n)} - x^{(n)}}{x'} = \frac{\bar y^{(n)} - y^{(n)}}{y'} = h$$

is a finite quantity, and hence, besides (119), also

$$\begin{aligned}
x^{(n)} &= D^n \varphi(t + \tau_3, a + \omega)_{t_3} = \bar x^{(n)} - h x' = D^n \bar\varphi(\lambda), \\
y^{(n)} &= D^n \psi(t + \tau_3, a + \omega)_{t_3} = \bar y^{(n)} - h y' = D^n \bar\psi(\lambda),
\end{aligned}$$

if, according to remarks on p. 56, the quantity $\lambda^{(n)}$, which occurs nowhere
in (119) and is arbitrary with respect to the value of E, is replaced by $\lambda^{(n)} - h\lambda'$. But this means that the contact that the two curves ā and u have with
one another at point 3 is of $n$th, instead of only $(n-1)$th, order.

The curve ā can then be expressed as follows in the vicinity of the point
of contact 3 ($t = t_3'$):

$x = \bar\varphi_1(t)$, $y = \bar\psi_1(t)$, so that

$$\bar\varphi_1^{(\mu)}(t_3') = \varphi^{(\mu)}(t_3', u), \qquad \bar\psi_1^{(\mu)}(t_3') = \psi^{(\mu)}(t_3', u) \tag{123}$$

($\mu = 0, 1, \dots, n$).

But since, by (117), we may always also assume $|\Theta(t_0', t_3'; u)| > 0$ on ā,
where the value $t_0'$ belongs to point 0, the formulas (107) of the “continuous






der „stetigen Fortsetzung" der Schar U hier anwendbar, wenn jetzt die con- 
stanten Grossen a v , to, to durch die variablen u u , t' 0 , t' 3 ersetzt werden, so 
dass mit Riicksicht auf (123) fiir geeignete Werte der Grossen Tq M \ uj u : 

D ^ [tp(t + t 0 , u + w) - (/?(£)] f , = 0 
[ip(t + t 0 , u + w) - ip(t)\ ^ = 0 
D^[p(t + T3,u + uj)-ip{t)] t , 3 = + e) - <p{^ (t' 3 ) 

[ip(t + t 3 ,u + u)- t , = ip^ (t ' 3 + e) - xp^ (t 3 ) 

(H = 0,l,...n-1) , 

oder, wenn man fiir kleine ]e| setzt: 

I u> v = + (e ) 2 , t^ ] = et^ + (e ) 2 , t 3 (m) = £t 3 m) + (e ) 2 , 

gemass (107a) beiderseits nach Potenzen von e entwickelt und die Coefficien- 
ten von e l einander gleich setzt: 



ip'(t,u)To + J2‘Pv(t,u)oj l , =0 



D » 
D » 



4>'(t, u)t 0 + J2 1p„(t, u)uj,. 



= 0 



ip'{t,u)r 3 + , =V^ + 1 \t 3 ) 

V J ^3 

^\t,u)T 3 +Yj'Pv{t,u)u v t =^^ + 1) (t' 3 ) 

ii J tr. 



(124a) 



(M = 0, 1, • • .n — 1) , 



wo wegen (123): 



Vi l + 1) (t 3 ) = + 1) (t 3 ,u ) = D^tp' (t,u) t ' 3 , 

V , 1 M + 1) (4) = ‘>P^ + 1) (t 3 ,u) = D^ip'(t,u)t' 3 , 
die beiden letzten Zeilen in der Form geschrieben werden konnen: 



D » 



ip'(t,u)(T 3 - d+E ip v (t,u)u v 

V 

ip'(t, u)(t 3 - 1 ) + ^2 



= 0 



= 0 



Dieses System (124a) von 4 n linearen und homogenen Gleichungen mit 
den 4n Unbekannten: uj v , Tq^, t 3 ^ — e^o kann aber nur bestehen, wenn 







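continuation” of the family U apply here, if the constant quantities $a_\nu$, $t_0$,
$t_3$ are replaced by the variable ones $u_\nu$, $t_0'$, $t_3'$, so that, considering (123), for
suitable values of the quantities $\tau_0^{(\mu)}$, $\tau_3^{(\mu)}$, $\omega_\nu$,

$$\begin{aligned}
D^\mu\left[\varphi(t + \tau_0, u + \omega) - \varphi(t)\right]_{t_0'} &= 0 \\
D^\mu\left[\psi(t + \tau_0, u + \omega) - \psi(t)\right]_{t_0'} &= 0 \\
D^\mu\left[\varphi(t + \tau_3, u + \omega) - \varphi(t)\right]_{t_3'} &= \bar\varphi_1^{(\mu)}(t_3' + \varepsilon) - \bar\varphi_1^{(\mu)}(t_3') \\
D^\mu\left[\psi(t + \tau_3, u + \omega) - \psi(t)\right]_{t_3'} &= \bar\psi_1^{(\mu)}(t_3' + \varepsilon) - \bar\psi_1^{(\mu)}(t_3')
\end{aligned} \tag{124}$$

($\mu = 0, 1, \dots, n-1$),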



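or, if, for small $|\varepsilon|$, we put

$$\omega_\nu = \varepsilon\,\bar\omega_\nu + (\varepsilon)^2, \qquad \tau_0^{(\mu)} = \varepsilon\,\bar\tau_0^{(\mu)} + (\varepsilon)^2, \qquad \tau_3^{(\mu)} = \varepsilon\,\bar\tau_3^{(\mu)} + (\varepsilon)^2,$$

expand, according to (107a), in powers of $\varepsilon$ on both sides and equate the
coefficients of $\varepsilon^1$ with one another:

$$\begin{aligned}
D^\mu\Bigl[\varphi'(t,u)\,\bar\tau_0 + \sum_\nu \varphi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_0'} &= 0 \\
D^\mu\Bigl[\psi'(t,u)\,\bar\tau_0 + \sum_\nu \psi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_0'} &= 0 \\
D^\mu\Bigl[\varphi'(t,u)\,\bar\tau_3 + \sum_\nu \varphi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_3'} &= \bar\varphi_1^{(\mu+1)}(t_3') \\
D^\mu\Bigl[\psi'(t,u)\,\bar\tau_3 + \sum_\nu \psi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_3'} &= \bar\psi_1^{(\mu+1)}(t_3')
\end{aligned} \tag{124a}$$

($\mu = 0, 1, \dots, n-1$),

where, on account of (123):

$$\begin{aligned}
\bar\varphi_1^{(\mu+1)}(t_3') &= \varphi^{(\mu+1)}(t_3', u) = D^\mu \varphi'(t,u)_{t_3'}, \\
\bar\psi_1^{(\mu+1)}(t_3') &= \psi^{(\mu+1)}(t_3', u) = D^\mu \psi'(t,u)_{t_3'},
\end{aligned}$$

the last two lines can be written in the following form:

$$\begin{aligned}
D^\mu\Bigl[\varphi'(t,u)(\bar\tau_3 - 1) + \sum_\nu \varphi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_3'} &= 0 \\
D^\mu\Bigl[\psi'(t,u)(\bar\tau_3 - 1) + \sum_\nu \psi_\nu(t,u)\,\bar\omega_\nu\Bigr]_{t_3'} &= 0.
\end{aligned}$$

This system (124a) of $4n$ linear and homogeneous equations in the $4n$ unknowns $\bar\omega_\nu$, $\bar\tau_0^{(\mu)}$, $\bar\tau_3^{(\mu)} - \varepsilon_{\mu 0}$ can only hold if either the determinant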







entweder die Determinante u) = 0 ist, oder die Unbekannten samtlich 

verschwinden. Da hier das erste ausgeschlossen ist, so muss 



UJ V = lim — 
£— o e 




= A' 




= 0 



sein fiir v = 1,2,... 2 n. 

Sollte dies aber fiir ein endliches Stiick von a iiberall stattfinden, so miisste 
u u = const, sein, d. h. dieser Teil von a fiele wegen (123) mit einer particula- 
ren Losung u der Differentialgleichung, mit einem Individuum der Schar U , 
vollstdndig zusammen, also mit a selbst, wenn a diese Eigenschaft (122) in 
seiner ganzen Ausdehnung zwischen 1 und 2 besitzen soil. Fiir andere Cur- 
ven a unserer „Nachbarschaft“ ausser a ist also diese Annahme mit unseren 
Voraussetzungen unvertraglich, q. e. d. 

Wir sind daher zur Aufstellung des folgenden Satzes berechtigt: 

Satz VIII. Eine particuldre Losung a der Differentialgleichung des Pro- 
blems: x = tp(t) = ip(t , a), y = ip(t) = ip(t, a), fiir welche zwischen den Punk- 
ten 1 und 2, d.h. im Intervall t\ ^ t ^ t 2 die Functionen ip^(t), 
sowie auch ipl^ft), i (t) (y = 0, 1, . . . n; v = 1,2, . . . 2n ) eindeutig, endlich 
und stetig sind, ff(t) und ip'(t) niemals gleichzeitig verschwinden und end- 
lich die Function 0(ti,t;a) [cf. (109) und (112)] von 0 verschieden ist, liefert. 
nach der Definition des zweiten Abschnittes und fiir r = n — 1 in (34) ein 
wirkliches Minimum des Integrates 



*2 




1 1 



1. in einer Nachbarschaft m = n — Iter Ordnung [cf. (37)], wenn gleich- 
zeitig die Function Eft) = E (x^ in der Bedeut.ung (73) 
und (78) fiir x^ = ip^(t), y^ = ip^ft) und fiir beliebige Werte der x^ n \ 
y im ganzen Intervall nur positive Werte besitzt und auch an den Stellen, 
wo sie etwa verschwindet, ihr Vorzeichen nicht wechseln kann; 

2. in einer Nachbarschaft m = n ter Ordnung, wenn statt dessen die 

Function F\ (x^\y^) fiir x*-^ = y ^ = ip^\t) im ganzen Intervall 

nur positiv ist und, auch wo sie verschwindet, ihr Zeichen nicht wechseln 
kann. 

In den Satzen V und VI wurde die Erfiillung der Bedingungen (77) 
E(x^\y^\x^\y^) ^ 0 und (80) F 1 (x^ ,y^) ^ 0 in der ganzen 
Ausdehnung von a als notwendig fiir das Bestehen eines Minimums bei 
r = m = n — 1, von (80) auch bei m = n, erwiesen. Wir sind nunmehr 
im stande, die Notwendigkeit der letzten Bedingung (80) auch unter der all- 
gemeineren Annahme r^n — 1, darzuthun, wahrend r < n — 1 durch 

den Satz IV ausgeschlossen ist. Ware namlich auf einer Curve a, welche nicht 
zu den singularen im Sinne von S. 85 gehort, an irgend einer Stelle t des 







$\Theta(t_0', t_3'; u) = 0$ or the unknowns all vanish. But since the first case is excluded here, we must have



-i . 

= hm — 

£=0 £ 





= 0 



for $\nu = 1, 2, \dots, 2n$.

But if this should be the case everywhere for a finite segment of ā, then we
would have to have $u_\nu$ = const., i.e., this part of ā would completely coincide
with a particular solution u of the differential equation, with an individual of
the family U, on account of (123), and hence with a itself, provided that ā
is to possess this property (122) in its entire extension between 1 and 2.
Hence, as far as other curves ā of our “vicinity” except a are concerned, this
assumption is incompatible with our presuppositions, q. e. d.
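The closing step of this argument rests on a standard fact of linear algebra: a square homogeneous linear system has a non-vanishing solution exactly when its determinant is zero. The sketch below illustrates only that fact on small hypothetical matrices; it does not build the actual system (124a), whose coefficients depend on formulas outside this section, and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Look for a non-zero pivot in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det          # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def has_nontrivial_solution(matrix):
    """True if the homogeneous system M z = 0 admits some z != 0."""
    return determinant(matrix) == 0

print(has_nontrivial_solution([[1, 2], [3, 4]]))   # determinant -2
print(has_nontrivial_solution([[1, 2], [2, 4]]))   # determinant 0
```

In the text the determinant is assumed different from zero, so only the second alternative remains and all unknowns must vanish.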

We are thus entitled to present

Theorem VIII. A particular solution a of the differential equation of
the problem: $x = \varphi(t) = \varphi(t, a)$, $y = \psi(t) = \psi(t, a)$, for which, between the
points 1 and 2, i.e., on the interval $t_1 \le t \le t_2$, the functions $\varphi^{(\mu)}(t)$, $\psi^{(\mu)}(t)$
and also $\varphi_\nu^{(\mu)}(t)$, $\psi_\nu^{(\mu)}(t)$ ($\mu = 0, 1, \dots, n$; $\nu = 1, 2, \dots, 2n$), are single-valued, finite and continuous, $\varphi'(t)$ and $\psi'(t)$ never vanish simultaneously and, finally,
the function $\Theta(t_1, t; a)$ [cf. (109) and (112)] is everywhere different from 0,
furnishes, according to the definition in the second section and for $r = n-1$
in (34), a real minimum of the integral

$$J = \int_{t_1}^{t_2} F\bigl(x^{(\mu)}, y^{(\mu)}\bigr)\, dt$$

1. in a vicinity of order $m = n-1$ [cf. (37)], if, at the same time,
the function $E(t) = E\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr)$ in the sense of (73) and (78)
for $x^{(\mu)} = \varphi^{(\mu)}(t)$, $y^{(\mu)} = \psi^{(\mu)}(t)$ and for arbitrary values of the $\bar x^{(n)}$, $\bar y^{(n)}$
takes only positive values in the entire interval and never changes its sign,
not even at positions where it vanishes;

2. in a vicinity of order $m = n$, if, instead, the function $F_1\bigl(x^{(\mu)}, y^{(\mu)}\bigr)$
is only positive for $x^{(\mu)} = \varphi^{(\mu)}(t)$, $y^{(\mu)} = \psi^{(\mu)}(t)$ in the entire interval and
does not change its sign, not even where it vanishes.

In Theorems V and VI, we proved that the satisfaction of the conditions (77) $E\bigl(x^{(\mu)}, y^{(\mu)}; \bar x^{(n)}, \bar y^{(n)}\bigr) \ge 0$ and (80) $F_1\bigl(x^{(\mu)}, y^{(\mu)}\bigr) \ge 0$ in the
entire extension of a is necessary for the existence of a minimum when
$r = m = n-1$, and, with respect to (80), also when $m = n$. We are now able
to show that, even under the more general assumption $r \ge n-1$, $m \le n$, the
last condition (80) is necessary, while $r < n-1$ is excluded by Theorem IV.
For if we had $F_1\bigl(x^{(\mu)}, y^{(\mu)}\bigr) < 0$ at any position t of the interval $t_1 \le t \le t_2$
on a curve a that does not belong among the singular curves in the sense







Intervalls t\ ^ t ^ t 2 Fi (x^ , y^) < 0, so bestande diese Ungleichheit we- 
95 gen der Stetigkeit von F\ auch fur ein ganzes | Intervall t' ^ t ^ t" . Hier 
konnte aber t" — t' so klein und eine neue Stelle to < t' so nahe an t' an- 
genommen werden, dass nicht nur bestandig ±0{t' ,t) > 0, sondern auch 
± 0(to,t) > C > 0 ware fur das ganze Teil-Intervall t' ^ t ^ t" . Ferner konn- 




Fig. 7. 



te man eine Curve c bestimmen, welche a in den Punkten t' und t" von der 
Ordnung r ^ n — 1 beriihrt, bis zu den Ableitungen r ter Ordnung stetig ist 
und einer beliebig engen Nachbarschaft m ter Ordnung (37) von a angehort. 
Nun konnten in (37) die Grossen (/x = 0,1,... n) so klein gewahlt werden, 
dass auch (vergl. die Bemerkungen bei (121)) E( A) < 0 ware in der ganzen 
Ausdehnung von c und daher nach (105a) J — J < 0, wenn unter J das liber a 
erstreckte Integral verstanden wird, unter J aber dasselbe Integral, in wel- 
chem nur der zwischen t' und t" liegende Teil von a durch c ersetzt ist. Somit 
wlirde a die Eigenschaft des Minimums nicht besitzen, da die betrachtete 
Variation jedenfalls zu den „erlaubten“ zu zahlen ist. 

In Bezug auf die „ conjugierten Punkte u , deren Vorhandensein flir VIII 
ausgeschlossen wurde, ist es mir allerdings noch nicht gelungen, mittelst der 
hier iiberall angewandten „Methode der benachbarten Losungen" zu sicheren 
Kriterien zu gelangen, wann sie mit dem Bestehen eines Minimums vereinbar 
sind, und ich muss hieriiber auf die in der Einleitung erwahnte Arbeit von 
Scheeffer (Math. Ann. XXV) verweisen; doch mogen hier noch die folgenden 
Bemerkungen Platz finden. 

In der Nahe eines zu 0 auf a conjugierten Punktes 3, flir welchen 
@(< 0 ) I 3 ; a) = 0 ist, existiert unter gewissen, sehr allgemeinen Bedingungen 
eine Curve a = g (x = ^(A), y = ip( A)), welche alle Curven u, darunter a, 
einer Schar U von der definierten Eigenschaft der Reihe nach von n ter, statt 
nur von n — 1 ter Ordnung beriihrt., ohne selbst mit einer Curve u zusam- 
menzufallen, und die als „Enveloppe u der Schar U bezeichnet werden moge 
96 in einem von dem gewohnlichen etwas abweichenden Sinne | des Wortes. Auf 
dieser „Enveloppe“ muss einerseits, unseren Betrachtungen von (124a) zu- 
folge, bestandig 0(t' o ,t' 3 -, u) = 0 sein, d. h. sie enthalt die zu 0 conjugierten 
Punkte 3' aller u, andererseits aber ist nach (122) hier bestandig k — k = 0 
und daher E( A) = 0. 

Nimmt man also ti ^ to < ^3 ^ t -2 an und betrachtet die erlaubte Varia- 
tion 0 4 3 von 0 3 oder 1 0 4 3 2 von 1 2, wo 4 einen Punkt von g und 0 4 







specified on p. 85, then, on account of the continuity of $F_1$, this inequality
would also obtain for an entire interval $t' \le t \le t''$. But, in this case, it would
be possible to choose $t'' - t'$ so small and a new position $t_0 < t'$ so close to $t'$
that we would not only always have $\pm\Theta(t', t) > 0$ but also $\pm\Theta(t_0, t) > \zeta > 0$
for the entire partial interval $t' \le t \le t''$. Furthermore, it would be possible




Fig. 7. 



to determine a curve c that has contact of the order $r \ge n-1$ with a at the
points $t'$ and $t''$, is continuous up to the derivatives of $r$th order and belongs
to an arbitrarily narrow neighborhood of $m$th order (37) of a. It would now
be possible to choose the quantities $g_\mu$ ($\mu = 0, 1, \dots, n$) in (37) so small that
we would also have $E(\lambda) < 0$ (cf. the remarks at (121)) in the entire extension of c, and hence, by (105a), $\bar J - J < 0$, assuming that $J$ is the integral
taken along a, while $\bar J$ is the same integral in which only the part of a lying
between $t'$ and $t''$ is replaced by c. Thus, a would lack the property of the
minimum, since there is no doubt that the variation under consideration is
to be considered “admissible”.
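The mechanism can be illustrated numerically in the simplest non-parametric setting. The sketch below rests on assumptions that are not in the text: it takes $n = 1$ and the toy integrand $F = -y'^2$, whose second derivative with respect to $y'$ is negative everywhere and serves here as a stand-in for a negative $F_1$ (Zermelo's $F_1$ in the sense of (19) is not reproduced in this section). The extremal a is $y = 0$, and the comparison curve c is a small sine arch that touches a at $t'$ and $t''$; all names are hypothetical.

```python
import math

def J(y_prime, t_a: float, t_b: float, steps: int = 20000) -> float:
    """Midpoint rule for the toy integral J = ∫ -(y')^2 dt on [t_a, t_b]."""
    h = (t_b - t_a) / steps
    return h * sum(-(y_prime(t_a + (i + 0.5) * h) ** 2) for i in range(steps))

t_lo, t_hi = 0.4, 0.6      # the sub-interval t' <= t <= t''
eps = 1e-3                 # amplitude of the variation, kept small
length = t_hi - t_lo

def c_prime(t: float) -> float:
    """Slope of c: y = eps*sin(pi*(t - t')/(t'' - t')), which is zero at t', t''."""
    return eps * math.pi / length * math.cos(math.pi * (t - t_lo) / length)

J_a = J(lambda t: 0.0, t_lo, t_hi)   # integral along a itself (y = 0)
J_c = J(c_prime, t_lo, t_hi)         # integral along the comparison curve c
difference = J_c - J_a
print(difference)
```

The difference equals $-\varepsilon^2\pi^2 / (2(t'' - t'))$ for this toy case and is negative however small the amplitude is chosen, so the curve $y = 0$ cannot furnish a minimum.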

But in regard to the “conjugate points” whose existence was excluded
for VIII, I have failed thus far to arrive at reliable criteria by means of the
“method of neighboring solutions”, which is used everywhere here, to determine when they are compatible with the existence of a minimum, and, on this
matter, I have to refer to the paper by Scheeffer (1885), which I mentioned
in the introduction; nevertheless, let me add the following observations here.

Near a point 3 conjugate to 0 on a, for which $\Theta(t_0, t_3; a) = 0$, there exists,
under certain, very general conditions, a curve ā = g ($x = \bar\varphi(\lambda)$, $y = \bar\psi(\lambda)$)
that has contact of the $n$th, instead of only $(n-1)$th, order with all curves u,
including a, of a family U of the defined property successively, without itself
coinciding with a curve u, which shall be called the “envelope” of the family U,
in a slight departure from customary usage of that word. According to our
considerations concerning (124a), we must, on the one hand, always have
$\Theta(t_0', t_3'; u) = 0$ on this “envelope”, i.e., it contains the points 3' conjugate
to 0 of all u, but, on the other hand, by (122), always $\bar k - k = 0$ here, and
hence $E(\lambda) = 0$.

Thus, if we suppose that $t_1 \le t_0 < t_3 \le t_2$ and consider the admissible
variation 0 4 3 of 0 3, or 1 0 4 3 2 of 1 2, where 4 denotes a point of g and 0 4







die zu 4 gehorige Curve u bedeutet, so ist nach (105) 



Jo 43 — J 03 — 



J E( A) dX = 0 



A4 



und daher J 10432 = J12 , 

wahrend 4 immer so nahe an 3 angenommen werden kann, dass 0 4 3 einer 
beliebig engen Nachbarschaft n — 1 ter, ja, weil die Beriihrung in 3 von nter 
Ordnung ist, auch von ?rter Ordnung angehort. Da aber g der Differential- 




gleichung des Problems (von der Ordnung 2 n) im allgemeinen nicht geniigt, 
so kann immer J43 und damit auch J10432 = J12 nach II durch beliebig kleine 
Variationen noch verkleinert werden, so dass fiir a ein Minimum sicher nicht 
besteht. 

Dieses Ergebnis ist im wesentlichen die Verallgemeinerung einer zuerst 
von Herrn Lindelof (Moigno et Lindelof, Legons cle Calcul Differentiel et 
Integral; IV Calcul des Variations) entdeckten Eigenschaft der Kettenlinie 
in Bezug auf den Inhalt der Rotationsflache / y ds, bei welcher zwei auf der 
Rotationsachse sich schneidende Tangenten die Rolle der „Enveloppe“ spielen. 
Dagegen fehlt mir noch ein einfaches Kriterium fiir die Existenz einer solchen 
allgemeineren „Enveloppe“ von den vorausgesetzten Eigenschaften. 

Die in der vorliegenden Arbeit abgeleiteten Satze, namentlich II, V, VI 
und VIII, gestatten jetzt eine vollstandige Entscheidung iiber das Vorhan- 
densein eines Minimums in clem betrachteten Sinne und damit die Losung 
der gestellten Aufgabe mit alleiniger Ausnahme der folgenden Falle: 

1. wenn m < n — 1 sein soil, d. h. wenn ein Minimum in einer „Nachbar- 
schaft“ von niedrigerer als der n — 1 ten Ordnung verlangt wird, 

2. wenn m = n — 1 und gleichzeitig r > n — 1 sein soil (cf. V), 

3. wenn auf a die Functionen P, Q in ganzen Intervallen unstetig sind 
(cf. II), 

4. wenn a gemass IV aus einer Anzahl verschiedener particularer Losun- 
gen der Differentialgleichung zusammengesetzt ist, 

5. wenn a nicht sammtlichen in VIII vorausgesetzten Stetigkeitsbedin- 
gungen, sondern nur den durch (32) bis (35) ausgedriickten geniigt, d. h. 
wenn die (t) nicht alle stetig sind oder wenn a sich zwischen 1 und 2 
irgendwo selbst durchschneidet, 







the curve u belonging to 4, then, by (105),

$$J_{043} - J_{03} = \int_{\lambda_4}^{\lambda_3} E(\lambda)\, d\lambda = 0$$

and hence $J_{10432} = J_{12}$,

while we can always take 4 so close to 3 that 0 4 3 belongs to an arbitrarily
narrow neighborhood of the $(n-1)$th, and even $n$th, order since the contact
at 3 is of $n$th order. But since g does not, in general, satisfy the differential
equation of the problem (of order $2n$), $J_{43}$, and hence also $J_{10432} = J_{12}$, can,
by II, always be reduced further by means of arbitrarily small variations
so that there is certainly no minimum for a.

This result is essentially a generalization of the property, which was first
discovered by Lindelöf (Lindelöf and Moigno 1861, IV: Calcul des Variations), of the catenary with respect to the area of the surface of revolution
$\int y\, ds$, where two tangents intersecting one another on the axis of rotation
play the part of the “envelope”. By contrast, I still have no simple criterion
for the existence of such a more general “envelope” of the properties assumed.
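The tangent property of the catenary can be verified numerically. The sketch below is an illustration under stated assumptions rather than Zermelo's general construction: it uses the two-parameter catenary family $y = c \cosh((x - b)/c)$, forms the classical $2 \times 2$ determinant from the parameter derivatives on the member $c = 1$, $b = 0$ (assumed here to correspond to $\Theta$ for $n = 1$), finds its zero by bisection, and compares the points where the two tangents meet the axis. All function names are hypothetical.

```python
import math

def theta(u0: float, u: float) -> float:
    """2x2 determinant of the parameter derivatives of y = c*cosh((x - b)/c),
    evaluated on the member c = 1, b = 0 at the abscissas u0 and u."""
    def y_b(s: float) -> float:      # derivative with respect to b
        return -math.sinh(s)
    def y_c(s: float) -> float:      # derivative with respect to c
        return math.cosh(s) - s * math.sinh(s)
    return y_b(u0) * y_c(u) - y_c(u0) * y_b(u)

def conjugate_point(u0: float, lo: float, hi: float, tol: float = 1e-13) -> float:
    """Bisection for a zero of theta(u0, .) inside the bracket [lo, hi]."""
    f_lo = theta(u0, lo)
    if f_lo * theta(u0, hi) > 0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = theta(u0, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def tangent_axis_intercept(u: float) -> float:
    """Abscissa where the tangent to y = cosh(x) at x = u meets the axis y = 0."""
    return u - math.cosh(u) / math.sinh(u)

u0 = -1.0
u3 = conjugate_point(u0, 0.5, 3.0)
print(u3, tangent_axis_intercept(u0), tangent_axis_intercept(u3))
```

The two intercepts agree to within the bisection tolerance, so the tangents at the starting point and at its conjugate point meet on the axis of rotation, as the property attributed to Lindelöf states.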

The theorems derived in the present work, and in particular II, V, VI 
and VIII, now provide for a complete decision on the existence of a minimum 
in the sense under consideration, and hence the solution of the task before 
us, except only in the following cases: 

1. when we are supposed to have $m < n-1$, i.e., when a minimum is
required in a “neighborhood” of an order lower than $n-1$,

2. when we are supposed to simultaneously have $m = n-1$ and $r > n-1$
(cf. V),

3. when the functions P, Q are discontinuous on a on entire intervals
(cf. II),

4. when a is composed of several different particular solutions of the 
differential equation in accordance with IV, 

5. when a does not satisfy all continuity conditions assumed in VIII, 
but only those expressed by (32) to (35), i. e., when the (t) are not all 
continuous or when a intersects itself somewhere between 1 and 2, 







6. wenn a schon im betrachteten Intervall den zum Anfangspunkt „con- 
jugierten“ Punkt t[ enthalt: f)(t \ , t\ : a) = 0, ohne dass dort eine stetige „En- 
veloppe“ g existiert, 

7. wenn F\ oder E\ an einzelnen Stellen von a verschwindet, in deren 
Umgebung die Function negativ werden kann (cf. V, VI und VIII), sonst 
aber auf a immer positiv ist. 

Nur in einem dieser Ausnahmefalle wiirde die Entscheidung unter Um- 
standen erst von einer Special-Untersuchung abhangig zu machen sein. 



Zum Schlusse ist es mir Bediirfnis, Herrn Prof. H. A. Schwarz, dem ich die 
erste Anregung zu meiner Arbeit und vielfache Unterstiitzung durch warmes 
Interesse und giitige Ratschlage verdanke, meinen ergebensten Dank an dieser 
Stelle auszusprechen. 



98 | Thesen. 

I. 

In der Variations-Rechnung ist auf eine genaue Definition des Maximums 
oder Minimums grosserer Wert als bisher zu legen. 



II. 

Mit Unrecht wird der Physik die Aufgabe gestellt, alle Naturerscheinungen
auf Mechanik der Atome zurückzuführen.



III. 

Die Messung ist aufzufassen als das überall anwendbare Hilfsmittel, stetig
veränderliche Qualitäten zu unterscheiden und zu vergleichen.



99 | Vita. 

Natus sum Ernestus Zermelo Berolini a. d. VI Kal. Aug. anni MDCCC- 
LXXI patre Theodoro, matre Augusta e gente Zieger, quos morte praematura 
mihi ereptos valde lugeo. Fidei addictus sum evangelicae. Primis litterarum 
elementis imbutus postquam per novem annos gymnasium, quod dicitur „Lui-
senstädtisches“, Berolini frequentavi, maturitatis ibi testimonium vere anni
h. s. LXXXIX adeptus sum. Tum in philosophorum ordinem receptus per
quinque annos et sex menses studiis praecipue mathematicis, physicis, phi- 
losophicis operam dedi, primum Berolini, deinde sex menses Halis Saxonum, 




6. when a contains the point t\ : 0(t \ , t \ ; a) = 0, that is “conjugate” to 
the starting point, already in the interval under consideration without there 
being a continuous “envelope” g, 

7. when F_1 or E_1 vanishes at particular positions of a in whose vicinity
the function may become negative (cf. V, VI and VIII), while otherwise it is 
always positive on a. 

It is only in these exceptional cases that the decision may depend on the 
results of a separate investigation. 



In closing, I would like to express my deepest gratitude to Prof. H. A. 
Schwarz, who got me started in this project in the first place and supported 
my work in many ways, for his warm regard and gracious counsel. 



Theses.

I. 

It is necessary to place greater emphasis than before on a precise definition 
of the maximum or minimum in the calculus of variations. 



II. 

It is wrong to charge physics with the task of reducing all phenomena of 
nature to the mechanics of atoms. 



III. 

Measurement is to be conceived of as a universally applicable expedient 
for the distinction and comparison of continuously varying qualities. 



Vita. 

I was born Ernst Zermelo in Berlin on the 27th of July of the year 1871, of 
father Theodor and mother Auguste née Zieger, taken from me by premature
deaths that caused me great sadness. I am of Protestant faith. After I had
attended the Luisenstädtisches Gymnasium in Berlin for nine years and had
become well-acquainted with the first elements of literature and science, I 
attained my certificate of graduation in the spring of the year 1889. I was 
then admitted to the philosophical faculty. For five years and six months I 
studied mathematics, physics, and philosophy in particular, first in Berlin, 
then for six months in Halle in Saxony, then for six months in Freiburg im 




tum sex menses Friburgi Brisgaviae, postremo iterum Berolini. Audivi viros
doctissimos:

Aron, Dilthey, Ebbinghaus, Forster, Frobenius, Fuchs, Glan, Hettner,
Knoblauch, E. Kötter, Kundt (†), Lehmann-Filhes, Paulsen, Planck, de Richt-
hofen, Schlesinger, E. Schmidt, Schwarz, Wien (Berolini), Cantor, Erdmann,
Haym, Husserl, Ule, Wangerin (Halis), Lüroth, Münsterberg, Riehl, Stickel-
berger, Warburg, Weissenfels (Friburgi). Berolini seminarii mathematici, cui
praesunt Fuchs, Schwarz, Frobenius, per duos annos et sex menses sodalis 
fui, exercitationibus, quas Planck de rebus mathematicis physicis instituere 
solet, per quinque semestria interfui. 

Omnibus viris, qui me docuerunt, imprimis Lazaro Fuchs, Hermanno 
Amando Schwarz, Maximiliano Planck, summas gratias ago semperque habe- 
bo, nec non viro illustrissimo Aemilio Lampe, cuius consiliis et benevolentia in 
studiis meis magnopere adiutus sum. Societati quoque mathematicae, cuius, 
quamdiu Berolini aderam, sodalis fui, multum me debere confiteor. 




Breisgau, and finally again in Berlin. I attended lectures given by the very 
learned men: 

Aron, Dilthey, Ebbinghaus, Forster, Frobenius, Fuchs, Glan, Hettner,
Knoblauch, E. Kötter, Kundt (†), Lehmann-Filhes, Paulsen, Planck, von
Richthofen, Schlesinger, E. Schmidt, Schwarz, Wien (Berlin), Cantor, Erd-
mann, Haym, Husserl, Ule, Wangerin (Halle), Lüroth, Münsterberg, Riehl,
Stickelberger, Warburg, Weissenfels (Freiburg). For two years and six months 
I was a member of the Berlin Mathematical Seminar run by Fuchs, Schwarz, 
and Frobenius. For five semesters I took part in the exercises of mathematical 
physics offered by Planck. 

I am, and always will be, very grateful to all the men who taught me, in 
particular to Lazarus Fuchs, Hermann Amandus Schwarz, Max Planck, and, 
last but not least, to the very excellent Emil Lampe whose advice and goodwill
were very helpful and encouraging during my studies. I affirm that I also 
acknowledge a great debt to the Mathematical Society of which I was a 
member during my time in Berlin. 




Introductory note to 1896a, 1896b, 
and Boltzmann 1896, 1897 

Jos Uffink 

In 1896, Zermelo published a paper containing what is commonly called the
recurrence objection (Wiederkehreinwand) to Ludwig Boltzmann’s approach to
statistical physics. This paper immediately gave rise to a heated dispute with
Boltzmann, a controversy comprising the four papers discussed here, two
from each author. 1 Zermelo returned to the disputed topic in 1899, when he
chose the foundations of statistical physics as the subject of his Habilitation
lecture (1900) at the University of Göttingen.

Apart from this encounter with Boltzmann, Zermelo was also strongly 
interested in Josiah Willard Gibbs’ approach to statistical physics. Indeed, 
he translated Gibbs’ book, Elementary Principles in Statistical Mechanics 
(1902), into German (1905) and wrote a critical review (1906) of the book,
which concludes his papers devoted to statistical physics. 

This commentary provides some background, both historical and with re- 
spect to the current understanding, of Zermelo’s involvement with the foun- 
dations of statistical physics. Section 1 provides some historical background 
to the Zermelo-Boltzmann controversy; Section 2 deals with the conceptual 
preliminaries to the controversy; and Section 3 is devoted to a discussion of 
the controversy itself. 



1. The background to the Zermelo-Boltzmann 
controversy 

1.1. Theoretical physics in the late 19th century:
Boltzmann and Planck

When Zermelo wrote his 1896a,b, he was working in Berlin as an assistant 
to Max Planck, at that time already a prominent theoretical physicist. The 
relations between Planck and Boltzmann, and their different positions on the
problems facing theoretical physics in that era, seem crucial to the under- 
standing of the ensuing dispute. 

In the late nineteenth century, the foundations of physics were in a state
of flux. Many new developments from thermodynamics, chemistry, electro- 
dynamics, and various other areas, both theoretical and experimental, were 

1 Arguably a third paper by Boltzmann (1897b), entitled “On a mechanical theo-
rem by Poincaré” and not included in this volume, also contains a response by
Boltzmann to Zermelo. 



H.-D. Ebbinghaus, A. Kanamori (Eds.), Ernst Zermelo - Collected Works/ 
Gesammelte Werke II, DOI 10.1007/978-3-540-70856-8_3, Schriften der 
Mathematisch-naturwissenschaftlichen Klasse der Heidelberger Akademie 
der Wissenschaften 23, © Springer-Verlag Berlin Heidelberg 2013
Introductory Note, © Springer-Verlag, Papers, © Wiley-VCH
English translations with kind permission by Springer-Verlag




challenging traditional physical views, and one might well say that a lively 
competition was going on between different world views. Many physicists dis- 
cussed the question of which world view would provide the best prospect for 
the unity and progress of physics. 

Some authors, in particular Boltzmann, argued that the most promising 
way to move forward was to adopt the view that matter consists of molecules 
or atoms moving in accordance with the laws of classical mechanics. This 
view is sometimes called “mechanicism” (or “the mechanistic view of nature”, 
etc.) or “atomism”, depending on which of these two independent ingredients 
one wishes to emphasize, and had been successfully employed in the theory 
of gases by James Clerk Maxwell and Boltzmann in the 1860s and 1870s. 

Other authors, like Planck and Pierre Duhem, believed that thermody- 
namics provided the most trustworthy framework for the future of theore- 
tical physics. This theory refrained from asking whether matter is composed 
of atoms, moving in accordance with classical mechanics. Rather, it concen- 
trated on describing empirical relations between various observable quanti- 
ties, like temperature, pressure, volume, energy, and so forth. And whenever 
the issue arose about how such quantities would vary from one part of a 
material system to another, the usual approach was to model the system as 
a continuum rather than as being composed of atoms. Thermodynamics had 
emerged in the 1850s through the work of Rudolf Clausius and Lord Kelvin, 
but developed rapidly in the late 19th century to become a most versatile 
theory with a wide range of applications in chemistry, magnetism, optics and 
other areas far removed from the original study of heat engines. 

Still others placed their trust in electrodynamics and proposed a world 
view in which the electromagnetic fields were considered as the most funda- 
mental entities, and particles were taken as mere singularities in these fields. 
A further view, the ill-fated but briefly popular view of energeticism advo- 
cated by Wilhelm Ostwald and Georg Helm, argued that energy, and its 
various ways of transformation, was the sole universal concept needed in
physics. 

This struggle between the various world views in theoretical physics and 
the positions taken by the various authors inevitably came to depend not 
only on physical arguments, but also on a variety of philosophical themes, in 
particular the question of what the goal of a physical theory ought to be: to 
aim for a literally true description of the physical world (realism) or rather 
to describe and predict the results of experiments (empiricism).

1.2. Boltzmann and Planck’s views 

In this confusing situation, Boltzmann and Planck came to hold opposing 
views. Boltzmann championed the mechanistic-atomistic approach in his 
work on the theory of gases and physics in general. This view clearly em- 
braced the atomic hypothesis which in those days could not (yet) be checked 
by empirical evidence. However, this should not lead one to jump to the 




conclusion that Boltzmann was a naive realist about the existence of unob-
servable atoms. He was well aware, and indeed was one of the first
physicists to emphasize, that concepts like “atoms” are mere images (Bilder)
that we impose on natural phenomena, and that we can never be sure about 
whether our images actually correspond to physical reality (cf. de Regt 2005). 
His arguments for defending mechanicism were epistemological, or perhaps 
strategic, rather than ontological: Boltzmann argued (1897a, c) that the al- 
ternative view of matter as a continuum, as was common in thermodynamics, 
involved an equally untestable hypothesis, and claimed that the mechanistic- 
atomistic view provided the best prospect for moving forward in physics, even 
if we are not sure that atoms really exist. 2 

However, in the course of his lifelong work on the subject, Boltzmann 
came to recognize that the hypothesis that matter is constituted of atoms of 
finite size and mass, moving in accordance with classical mechanics, would 
not suffice to explain the thermal behavior of macroscopic bodies, even in 
the case of gases. The result by Boltzmann which had most impressed his 
contemporaries was his famous H-theorem of 1872 (to be discussed in more
detail below). This theorem seemed to explain the irreversible approach to- 
wards thermal equilibrium for gases and was first presented by Boltzmann as 
being derived from purely mechanical suppositions. However, the derivation 
of this result turned out to be more subtle than Boltzmann had originally 
presented it. Additional suppositions in the form of arguments from probabil- 
ity theory or statistical considerations were needed, and Boltzmann gradually 
came to appreciate and admit these suppositions. Indeed, it seems fair to say 
that Boltzmann never produced a clear account of how the mechanical and 
statistical ingredients would interrelate to provide the desired derivation of 
the approach to equilibrium and prove his H-theorem. How Boltzmann’s H-
theorem would explain the irreversible behavior of a gas towards equilibrium
and what assumptions were needed for such an explanation turns out to be 
an extraordinarily subtle matter, even today. However, this issue was not the 
focus of his disagreements with Planck. 

Planck, on the other hand, took a position that is less easy to describe. 
He certainly took thermodynamics, which had been the mainstay of his own 
work since the 1880s, as a more trustworthy theory than the mechanistic- 
atomistic view. However, he did not take the continuum view of matter as 
an essential part of his world view. For example, in 1887 he had advanced, 
simultaneously with (but independently of) Svante Arrhenius, the proposal
that salts, upon dissolution in water, split into electrically charged ions, in 
order to explain the empirical fact that the change in the boiling or freez- 
ing temperature and electrical conductivity of a solvent is strikingly different 
when a salt rather than a neutral substance (like sugar) is dissolved. Clearly, 
Planck had no problem in entertaining the hypothesis that matter is made 



2 The Boltzmann biography by Carlo Cercignani (1998) is therefore aptly called 
The man who trusted atoms, rather than The man who believed in atoms. 




of unobservable particles. But he was certainly not dogmatic about atomism, 
and always stressed that one should be most careful in assuming too many 
hypotheses about their properties. In particular, the assumption that they 
move in accordance with the laws of classical mechanics and that this would 
suffice to explain the thermal behavior of macroscopic matter seems to have 
been a bridge too far for Planck in the 1890s. Further, Planck agreed with 
Boltzmann in rejecting the position of the instrumentalists like Ernst Mach 
who argued that atomism was mistaken as a matter of methodological prin- 
ciple. Moreover, Planck and Boltzmann basically agreed in their rejection of 
energeticism in the debate on this view in 1894-5. 

Instead, Planck’s main disagreement with Boltzmann in the 1890s cen- 
tered on two issues. First of all there is the issue of irreversibility. Throughout 
his writings, at least until 1909, Planck regarded the irreversibility of thermal 
phenomena, as encapsulated by the second law of thermodynamics, as having 
absolute validity, and he criticized mechanistic views of nature for making the 
irreversible behavior of matter a merely statistical, rather than an absolute 
certainty. Indeed, Planck’s “personal reminiscences” describe his differences 
with Boltzmann (1946): 

Boltzmann knew quite well that my point of view was actually rather 
different from his. In fact, he became angry that I was not only in- 
different towards the atomistic theory, but even a little bit negative. 

The reason was that at this time I attributed the same exceptionless 
validity to the principle of the increase of entropy as to the princi- 
ple of the conservation of energy, whereas with Boltzmann the former 
principle appears only as a probabilistic law which as such also admits 
exceptions. 

Furthermore, Planck argued that the mechanistic-atomistic view adopt- 
ed in the kinetic theory of gases had simply failed to yield enough recent 
significant results in the 1890s, and contrasted this to the progress in ther- 
modynamics. Thus he wrote (1891): 

Anyone who has studied the works of Maxwell and Boltzmann, the
two scientists who have penetrated most deeply into the analysis of
molecular motion, will scarcely be able to escape the impression that
the remarkable physical insight and mathematical skill exhibited in 
conquering these problems is inadequately rewarded by the fruitful- 
ness of the results gained. 

Similarly, his election address to the Prussian Academy stated (1894b): 

At present, the theoretical physicist is faced with problems of a higher 
difficulty than a generation ago. In those days, there was for every- 
one who searched for a big encompassing idea in the exact natural 
sciences, or an all-embracing world view, only a single [. . . ] goal: the 
reduction of all natural processes to mechanics. This view has con- 
tributed many rich results to science, even when the audacious hope 




of following every single molecule or even every atom by measurement 
could not be realized. Still, in the irregular to-and-fro which reigns 
even in the smallest observable spaces in a gas containing billions of 
gas molecules, the statistical method has delivered many correspond- 
ing results. 

Today, however, this effort, directed at this ultimate goal, has come 
to a stand-still, and has given rise to a certain disillusionment. In- 
deed, the mere mathematical analysis needed to penetrate further 
into these complicated kinds of motion already meets with unsur-
mountable difficulties [. . . ].

In contrast, Planck continues: 

Recent physical research has witnessed a breakthrough in the ten- 
dency to forego the attempt to search for the connection of the phe- 
nomena in mechanics. [. . . ] The whole recent development of thermo- 
dynamics has been achieved by relying just on the two fundamental 
principles of the theory of heat. In particular, the fundamental rela- 
tions between electrodynamics and optics, among electric phenomena, 
chemical affinity, and thermodynamics were obtained without taking 
recourse to the mechanical view of the nature of such processes. Simi- 
larly, one expects that in the dependence of electrodynamic processes 
on temperature, as appears in the theory of radiation, one will come 
closer to an explanation without taking the tedious detour through 
the mechanical conception of electricity. 

Planck emphasized that his endorsement of the thermodynamical ap- 
proach should not be seen as a rejection of the idea that the ultimate goal 
of physics is the reduction of all phenomena to mechanics. He did not reject 
this goal as a matter of principle, but simply claimed that recent attempts 
to attain this ultimate goal had been fruitless. 

Planck’s judgment of the recent developments in theoretical physics in 
the 1890s is actually not hard to understand. The mechanistic view, which 
underlies the kinetic theory of gases, had produced quite remarkable and 
unexpected results in the 1860s and 1870s, e.g. the prediction by Maxwell 
(1860) that the viscosity of a gas should be independent of its density, or the
explanation of phase transitions in terms of intermolecular forces by Johannes
Diderik van der Waals (1873). But in the 1880s and 1890s there was little to
follow up these early successes. Boltzmann himself, for example, devoted great 
effort in the 1880s to calculating the viscosity of gases from the perspective of 
the mechanistic-atomistic view, taking his Boltzmann equation as a starting 
point. But these calculations got him nowhere near to the observed values for 
gas viscosity (1880, 1881a,b). By contrast, the thermodynamical approach
that Planck championed had been quite fertile in the 1890s, establishing 
fruitful connections between the theory of heat and prima facie unrelated 
areas like magnetism and chemistry. 




1.3. The succession of Kirchhoff 

Apart from their different positions on the question of what world view would 
be the most promising for the development of physics, the lives of Boltzmann 
and Planck also crossed at a professional level. 3 In October 1887 Gustav 
Kirchhoff died and left vacant the prestigious chair in theoretical physics 
at the University of Berlin. The university decided in December to offer this 
chair to Boltzmann, who, at that time, was employed as an ordinary professor 
in Graz, but also served as Rector of the University. This offer started what 
must have been one of the most peculiar job negotiations in academia. 

Boltzmann went to Berlin in the same month to negotiate the offer and 
signed an agreement on January 3, 1888 to take up his new chair in October 
1888. 4 News about this new contract soon leaked out to the local press, and 
the university administrators in Graz, who were eager to keep Boltzmann, 
started to question him. A report of such a conversation on January 8 states 
that Boltzmann denied having reached an agreement with Berlin, claiming 
instead that the negotiations were still only at an explorative stage. Of course, 
the University of Graz took this message as an indication that they might still
be able to keep Boltzmann. 

Meanwhile, Berlin went ahead with Boltzmann’s intended appointment. 
On March 19 his appointment was made official by the Prussian king. On 
April 23 followed his election as full member of the Prussian Academy of Sci- 
ences. However, in contacts with the authorities in Graz who were inquiring 
under which conditions Boltzmann was willing to stay there, he suggested 
that he would seek to forego the position in Berlin on the basis of his bad 
eyes. 

On June 6 Boltzmann wrote to Berlin to inquire when his employment 
would start and expressed his willingness to resign from Graz. But after a 
request from Berlin that he send proof of his resignation, he sent a letter 
on June 24 requesting that he be relieved from his commitment to Berlin, 
explaining that he had gone through terrible agony and that he was suffer- 
ing from poor eyesight and nervous disorder and felt unable to take up the 
prestigious chair. On June 27 he sent a telegram requesting that his previous 
letter be left unopened. On June 28 he sent another telegram requesting that 
his last telegram be ignored and that his letter be opened. 

Thereupon, Berlin discreetly approached Boltzmann’s wife, Henriette,
about how to proceed. She replied on July 2 that the many agonizing and sad 
affairs Boltzmann had been dealing with as Rector in Graz had wrecked his 
nerves, and that the effort to decide on his resignation had excited his nerves 
further, so much so that the physicians feared the worst for his health. The

3 All the historical claims in this subsection are based on Höflechner 1994.

4 The prestige involved in this offer may perhaps be judged from the salaries 
mentioned by Höflechner (1994): In Graz, Boltzmann earned a yearly income
equivalent to around 30,000 Mark. In his new position in Berlin, Boltzmann 
would receive a yearly income of 137,000 Mark. 




upshot was that Boltzmann had not resigned in Graz, although she added 
that she had witnessed how terribly difficult it had been for him to forego 
the offer from Berlin and that he was already in deepest grief about rejecting 
the offer for a position that would suit him so much. In a letter of July 9 the 
Berlin authorities confirmed Boltzmann’s release from his contract. On July 
16 Boltzmann sent another message to Berlin, stating that he was day and 
night in the most bitter resentment over a step he had taken in a moment 
of excitation and asked whether it was still possible to change his mind (i.e. 
to accept the offer after all) and offered to travel to Berlin and explain and 
apologize for his behavior. However, the authorities replied that his release 
was final. Yet, in August, Boltzmann inquired again, expressing his eagerness 
to come to Berlin. And even six years later, in 1895, Henriette Boltzmann 
wrote to the University of Berlin to inquire whether the position was still 
open. 

Berlin, however, had already made their decision. On September 27, 1888 
the Ministry of Education requested that the University of Berlin propose a 
successor to Kirchhoff, and suggested the names of Heinrich Hertz and Max 
Planck. The appointment went to Planck. 

It is, of course, a delicate matter to speculate on how historical actors 
felt in a given situation. It seems likely, however, that Boltzmann felt torn 
between the attractive offer from Berlin, where he would have been in a circle 
of other physicists he could talk to; fears that he would not be able to live 
up to expectations, due to his ailing health; and feelings of responsibility 
towards Graz. How Boltzmann felt about the failure of these negotiations we 
do not know. 5 But it is not hard to imagine how he might have felt when 
Planck, invested with the authority of his new position, which (at least part 
of) Boltzmann had sincerely desired, proceeded to state (1891, 1894b) that
Boltzmann’s approach to physics seemed fruitless. 

Amongst the first tasks that Planck undertook as Kirchhoff’s successor 
was to edit Kirchhoff’s collected works, among which were Kirchhoff’s lectures 
on the theory of heat, which touched on the kinetic theory of gases (Kirchhoff
1894). Kirchhoff’s aims in these lectures had been much more modest than 
Boltzmann’s: in particular, Kirchhoff only considered equilibrium states and 
left the whole issue of the approach to equilibrium or the increase of entropy 
during irreversible processes untouched. However, Kirchhoff’s lectures did 
contain an argument to show that a system of gas molecules in equilibrium 
would be described by the Maxwell distribution law. 



5 There is, however, one passage in his later writing where Boltzmann returned 
to the episode. In 1905b, he eagerly quotes an anonymous American colleague 
stating that Berlin had recently declined in reputation, and that “a lot of things 
would be in much better shape if I had accepted the offer from Berlin”. Boltzmann 
added: “Many seemingly unavailable persons could have been made to come, if 
one had really wanted to have them”. 




Boltzmann (1894) reviewed Kirchhoff 1894 and pointed out an objection 
against Kirchhoff’s argument, and along the way, implicitly commented on 
Planck’s verdict (1891) on his own approach: 

Even those who, like the editor of the volume under consideration 
on Kirchhoff’s lectures on the theory of gases [i.e., Planck], maintain 
that this theory is unworthy of the acumen that has been applied to 
it, will not wish that those who write on this topic will do so with 
less acumen. 

Planck (1894a) responded to Boltzmann’s review in a manner which 
showed that he felt hurt by these remarks by Boltzmann. He extensively 
argued that his job as editor of Kirchhoff’s papers had nothing to do with 
his own opinions on the theory of gases, and that he took no responsibility 
for their content. However, Planck did take the occasion to mention an idea 
of his own on how Kirchhoff’s proof could be improved and saved against 
Boltzmann’s objection. 

Unfortunately, Planck’s idea was untenable, and Boltzmann (1895a) pro- 
ceeded, after an admission that he had only intended to criticize the contents 
of Kirchhoff’s ideas and never the person of the editor — and calling Planck’s 
idea “promising” — to make short shrift of Planck’s proposal. 

My main intention so far has been to argue that the relation between 
Boltzmann and Planck must have been tense in the period preceding the con- 
troversy with Zermelo. Of course, relations between Planck and Boltzmann 
did not end here. Planck went on to do epoch-making work in black-body 
radiation in 1900, and, along the way, came closer to Boltzmann’s point of
view than he could have imagined in the mid-1890s. Indeed, Planck’s address 
at the University of Leiden (1909) has often been seen as a complete turn of 
mind in favour of Boltzmann’s viewpoint. But it would take us too far afield 
to discuss this here, since, by then, Zermelo had long left Berlin and Planck’s 
guidance. Boltzmann, on the other hand, received (and accepted) offers from 
the universities of Munich (1889), Vienna (1894), Leipzig (1900) and moved 
back to Vienna in 1902. 

Given these strained relations between Boltzmann and Planck at the time 
when Zermelo published his critique in 1896 and Zermelo’s close connection 
to Planck, Boltzmann saw Zermelo as Planck’s mouthpiece. In fact, Zermelo 
was not only employed at that time as Planck’s assistant, but Planck was also
editor of the Annalen der Physik und Chemie, the journal in which Zermelo’s 
1896a was published. Boltzmann wrote a letter to the editor-in-chief of the 
Annalen, Eilhard Wiedemann, on March 20, 1896 accompanying his reply 
(1896):

Most esteemed Colleague! 

I enclose a reply to a paper that appeared in the previous issue of 
your highly esteemed Annals by a certain Mr. Zermelo. I wrote it 
within 2 days after I received this issue, but it is not long, and I 




would like to ask you kindly that it will appear as soon as possible. 
Because it seems that, today, since Maxwell, Clausius, Helmholtz etc. 
are all dead, I am the only advocate who opposes the view that the 
mechanical explanation of nature has to be given up, it appears to me 
that I am, I might say in the interest of science, obliged to take care 
that at least my voice does not die away, and therefore an answer, as 
quick as possible, is essential. 

Now I come to a delicate point. Prof. Max Planck is explicitly men- 
tioned as a collaborator of the Annals, and Zermelo is his student. 

I believe to be justified in demanding: 1. that Mr. Planck will not 
delay the appearance of my reply, 2. that no word in this reply is al- 
tered, 3. that no reply appears in the same issue; later he can answer 
what he wants and can. 

It is clear that Boltzmann felt Planck’s blessing behind Zermelo’s 1896a. 
Planck himself later reminisced feeling the same way (1946): 

In any case, he [i.e., Boltzmann] answered young Zermelo with 
scathing remarks that also hit me, because actually Zermelo’s pa- 
per had appeared with my permission. 

Also, in a letter to his friend Leo Graetz of May 23, 1897, Planck expressed 
his views on the Zermelo-Boltzmann controversy (cf. Kuhn 1978, 27): 

On the main point I side with Zermelo, in that I think it altogether 
hopeless to [attempt to] derive the speed of irreversible processes — 
e.g. viscosity or heat conduction in gases — in a really rigorous way 
from contemporary gas theory. Since Boltzmann himself admits that 
even the direction in which viscosity and heat conduction act can be 
derived only from considerations of probability, how can it happen 
that under all conditions the magnitude of these effects has entirely 
determinate values? Probability can serve, if nothing is known in ad- 
vance, to determine the most probable state. But it cannot serve, if 
an improbable [initial] state is given, to compute the following [state]. 
That is determined not by probability but by mechanics. To maintain 
that change in nature always proceeds from [states of] lower to higher 
probability would be totally without foundation. [. . .] Zermelo, how-
ever, goes further [than I], and I think incorrect[ly]. He believes that
the second law, considered as a law of nature, is incompatible with 
any mechanical view of nature. The problem becomes essentially dif- 
ferent, however, if one considers continuous matter instead of discrete 
mass-points like the molecules of gas theory. I believe and hope that 
a strict mechanical interpretation can be found for the second law 
along this path, but the problem is obviously difficult and requires 
time. 




Introductory note to 1896a, 1896b, and Boltzmann 1896, 1897 197 



It is clear from this letter that Planck and Zermelo did not agree com- 
pletely, and that Zermelo was not in fact merely Planck’s mouthpiece. 6 

However, although Boltzmann was right in recognizing Planck’s patron- 
age behind Zermelo’s 1896a, and given that Planck had made it explicit 
how he regarded Boltzmann’s mechanistic approach as lacking in fruitfulness,
Zermelo’s paper is remarkably even-handed in treating the mechanistic and 
the thermodynamic viewpoints: it simply presents a logical dilemma between 
these two viewpoints without taking sides. 



2. Boltzmann’s work in the kinetic theory of gases 

Before going into the subject of the Zermelo-Boltzmann controversy, I will 
now try to review how Boltzmann’s work in the kinetic theory of gases evolved 
in the preceding years. From 1866 onwards, Boltzmann wrote an impressive 
number of papers and books dealing with statistical physics, in particular on 
the kinetic theory of gases. 

The basic assumption of the kinetic theory of gases, as it developed in the 
19th century, is that a gas may be modeled as a large but finite number N of 
tiny particles moving in accordance with the laws of mechanics, occasionally 
colliding with each other as well as with the walls of the container in which 
it is enclosed. The basic tool of the theory, introduced by Maxwell (1860), is 
to represent the state of such a gas by a smooth distribution function $f(v)$
such that $f(v)\,dv$ provides the relative number of particles moving with a velocity
between $v$ and $v + dv$. Of course, by modern standards, one would say
that such a mode of representation involves some approximations or idealizations,
since the actual number of particles with their velocity in some given
range will always be a natural number, and hence $f(v)$ cannot be smooth.
Nevertheless, the idea seems reasonable enough, under the approximative as- 
sumptions that N is really huge, and the diameter (or range of interaction) 
of the molecules is very tiny. In any case, none of the 19th century authors 
in kinetic theory worried too much about how or under what limit this mode 
of representation could be made exact. 

Maxwell had famously argued in 1860 and 1867 that for the special case 
when the gas was in thermal equilibrium, the corresponding distribution func- 
tion takes the form of a normal or Gaussian distribution (in this context often 
referred to as the Maxwell distribution law): 

$$f_{\mathrm{eq}}(v) = A e^{-v^2/B} , \qquad (1)$$



6 However, I can find no written statement by Zermelo endorsing the view attributed
to him by Planck that the strict validity of the second law is incompatible
with mechanicism tout court, i.e. even for continuum mechanics. It is quite
clear that the very basis of Zermelo’s objection, i.e. the recurrence theorem, does
not hold in that context.




198 Jos Uffink 



where A is just a normalization constant and B is proportional to the abso- 
lute temperature of the gas. Although Maxwell investigated many inferences 
drawn from this distribution law, some of which were quite unexpected and 
yet turned out to be confirmed by subsequent experiment, he never inves- 
tigated systematically what would happen if the gas was not in thermal 
equilibrium. 

Ever since 1868, Boltzmann had been closely studying Maxwell’s work on 
kinetic theory, and extended it in various directions (e.g. to the case where 
the molecules are subject to an external force field). In 1872 he proposed an 
argument to deal with the case when the gas was not in thermal equilibrium. 
Boltzmann assumed that in such a case one could still represent the state 
of the gas by a distribution function $f_t(v)$, where the index $t$ indicates that
this function may change in the course of time. He argued that the evolution
of $f_t$ in time, as a consequence of the collisions between the particles,
should obey a particular evolution equation, now known as the Boltzmann
equation. And although this equation was far too difficult to solve, Boltzmann
was nevertheless able to show that his evolution equation implied a
most important result: For the functional $H$ defined on the distribution functions
by $H[f_t] := \int f_t(v) \ln f_t(v)\,dv$, the equation implies that $H[f_t]$ can only
change monotonically in time, and becomes stationary in time only if $f$ is the
Maxwell distribution (1). In short, Boltzmann claimed to have shown that
if a gas was not yet in a state of thermal equilibrium, it would necessarily 
evolve towards a final state of equilibrium in the course of time. 
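To make the definition of H concrete, here is a minimal numerical sketch. It is not taken from Boltzmann's papers: it assumes a one-dimensional velocity variable, unit normalization and an arbitrary spread SIGMA, and the names h_functional, maxwell_1d and uniform_same_variance are hypothetical. It compares H for a Gaussian of the form (1) with H for a flat distribution of the same variance.

```python
import math

SIGMA = 1.0  # assumed velocity spread, arbitrary units


def h_functional(f, v_min, v_max, n=20000):
    """Midpoint-rule approximation of H[f] = integral of f(v) ln f(v) dv."""
    dv = (v_max - v_min) / n
    total = 0.0
    for i in range(n):
        v = v_min + (i + 0.5) * dv
        fv = f(v)
        if fv > 0.0:  # f ln f tends to 0 as f tends to 0
            total += fv * math.log(fv) * dv
    return total


def maxwell_1d(v):
    """Gaussian A * exp(-v^2 / B) with B = 2 * SIGMA^2 and A fixed by normalization."""
    a = 1.0 / math.sqrt(2.0 * math.pi * SIGMA ** 2)
    return a * math.exp(-v * v / (2.0 * SIGMA ** 2))


def uniform_same_variance(v):
    """Flat distribution with the same normalization and variance as maxwell_1d."""
    half_width = SIGMA * math.sqrt(3.0)
    return 1.0 / (2.0 * half_width) if abs(v) <= half_width else 0.0


h_maxwell = h_functional(maxwell_1d, -12.0, 12.0)
h_uniform = h_functional(uniform_same_variance, -12.0, 12.0)
print(h_maxwell, h_uniform)
```

Under these assumptions the Gaussian yields the smaller value of H. This is consistent with, though of course no proof of, the claim that H becomes stationary only at the Maxwell distribution.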

It is clear that this particular result of Boltzmann, which is now known as
the H-theorem, was considered the most impressive of his achievements by his
contemporaries. For example, the proposal to elect Boltzmann to the Prussian
Academy of Sciences (1888), written by the academicians Hermann von
Helmholtz, Leopold Kronecker, Wilhelm von Bezold and Werner von Siemens,
states what these authors considered to be Boltzmann’s main achievement
(Kirsten and Körber 1975, 109):

The main work in his life is the kinetic theory of gases. In particular, 
he has proved that the law of distribution of the various values of 
the velocities, which Maxwell had only verified as a correctly guessed 
hypothesis, must actually be the necessary form of the final state, as 
a consequence of the collisions between the molecules. In this work he 
has shown a high degree of capability of abstraction, and the ability 
of conquering extremely difficult and involved problems. 

Nevertheless, not all of Boltzmann’s contemporaries accepted the H-theorem
at face value. Loschmidt (1876a,b, 1877a,b) was the first to question
the general validity of this theorem, and later in 1894-5 a flurry of papers
appeared in Nature after Edmund P. Culverwell (1894) asked the seemingly
innocent question “Will someone say exactly what the H-theorem proves?”.

I cannot do justice here to the debates about the H-theorem that preceded
the Zermelo-Boltzmann controversy. I limit myself to only a few remarks.





First, taking the ahistorical perspective of hindsight, in the derivation of
the Boltzmann equation, on which the H-theorem depends, Boltzmann had
to rely on a crucial assumption about the collisions between pairs of particles
in the gas, which is now known as the Stoßzahlansatz. Roughly speaking, the
Stoßzahlansatz states that the velocities of any two particles entering into a
collision are statistically independent before the collision, although a similar
independence is not demanded for their velocities after the collision. This assumption
had already been explicitly made by Maxwell in 1867 (which might
explain why Boltzmann regarded it as uncontroversial or self-evident). But
then, Maxwell did not enter into the question of the evolution of the distribution
function but focused on the problem of characterizing the distribution
corresponding to the thermal equilibrium state, so that in Maxwell’s case the
assumption was not used for explaining time-asymmetric phenomena.

However, Boltzmann (1872) had extended Maxwell’s investigations to the
question of describing the evolution of the distribution function, considered
at any time t, and the claim of his H-theorem is that this function must
approach a Maxwellian distribution at later times, but not at earlier times,
i.e., the H-theorem is not time-reversal invariant.

And so, when the issue is raised of how the asymmetry under time reversal
embodied in the H-theorem could be reconciled with a perfectly time-reversal
invariant theory such as the mechanics of particles, modern commentators
will point out that Boltzmann had relied on the Stoßzahlansatz, which treats
pre-collision coordinates differently from post-collision coordinates, and that
this assumption is responsible for breaking the underlying symmetry between
the past and future directions of time in mechanics and is crucial to understanding
the H-theorem. But Boltzmann had not stated his assumption in
any detail. Indeed, here is the only passage in his 1872 paper devoted to the
Stoßzahlansatz (1872, 323):

The determination [of the number of collisions] can only be obtained 
in a truly tedious manner. [...] But since this determination has, apart 
from its tediousness, not the slightest difficulty, nor any special inter- 
est, and because the result is so simple that one might almost say it 
is self-evident I will only state the result. 

Obviously, by avoiding details and calling the issue “self-evident” and
claiming that nothing of interest was at stake, Boltzmann did not recognize,
or at least failed to alert his readers to, the crucial significance of the
Stoßzahlansatz to his H-theorem.

Indeed, when Boltzmann was subsequently challenged by Loschmidt in
1876 on the question of how the time-asymmetry of the H-theorem could be reconciled
with the time-symmetry of the mechanical laws of motion, he did not
point to the Stoßzahlansatz, but only claimed, without proof, that although
there were conceivable states of a gas that would violate the conclusion of his
H-theorem, such exceptions were extremely improbable. Thus, this exchange







did not bring along any clearer recognition of the assumptions involved in
the derivation of the H-theorem.
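The time-symmetry of the mechanical laws of motion to which Loschmidt appealed can be illustrated numerically. The sketch below is an illustration only and appears in none of the sources discussed: it assumes a single particle of unit mass in a harmonic potential, and it uses the velocity-Verlet integrator, chosen here because that scheme is itself symmetric under time reversal. All function names are hypothetical.

```python
def force(x):
    """Force for the assumed potential V(x) = x^2 / 2 (unit mass)."""
    return -x


def verlet_step(x, v, dt):
    """One velocity-Verlet step for position x and velocity v."""
    v_half = v + 0.5 * dt * force(x)
    x_new = x + dt * v_half
    v_new = v_half + 0.5 * dt * force(x_new)
    return x_new, v_new


def evolve(x, v, dt, steps):
    """Apply verlet_step repeatedly and return the final state."""
    for _ in range(steps):
        x, v = verlet_step(x, v, dt)
    return x, v


x0, v0 = 1.0, 0.3
x1, v1 = evolve(x0, v0, dt=0.01, steps=5000)   # forward motion
x2, v2 = evolve(x1, -v1, dt=0.01, steps=5000)  # reverse the velocity, evolve again

# Up to rounding error, (x2, -v2) coincides with the initial state (x0, v0).
print(abs(x2 - x0), abs(-v2 - v0))
```

Reversing the velocity and evolving for the same duration retraces the motion. This is the sense in which any time-asymmetric conclusion must draw on an ingredient beyond the equations of motion.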

It is only in Boltzmann 1894, in the very paper that reviewed Planck’s
edition of Kirchhoff’s work in the theory of gases and just prior to the 1895 debate
in Nature, that I can find Boltzmann stating the claim clearly that, in gas
theory, for any pair of molecules entering into a collision, their pre-collision velocities
should be regarded as independent, but their post-collision velocities
should not. As mentioned earlier, Kirchhoff’s Vorlesungen über Wärmetheorie
(1894) discusses the kinetic theory of gases (but only in equilibrium) and
assumes that the probability that a pair of molecules simultaneously have
positions and velocities in the regions $\delta x_1 \delta v_1$ and $\delta x_2 \delta v_2$ is proportional to

$$f(x_1, v_1)\, f(x_2, v_2) . \qquad (2)$$

Boltzmann’s critique in 1894 distinguishes three cases: 

1. These molecules are about to collide with each other. 

2. These molecules have just collided with each other. 

3. All other cases. 

He then writes: 

In the second case [. . . ] one cannot consider the presence of a molecule 
in its region as independent of the presence of the other particle in 
its region. 

Thus, according to Boltzmann 1894, the assumption of statistical indepen- 
dence between pairs of particles is all right immediately before collisions, but 
not immediately after. However, the paper does not discuss the H-theorem,
so it would still not have been evident to his readers that the assumption 
stated in this claim is in fact crucial to the proof of that theorem. And apart 
from this, Boltzmann did not discuss the motivation for his claim, and it 
might be that he still regarded its validity as self-evident. 

The next occasion at which the problem was discussed of how the H- 
theorem could be reconciled with, or even derived from, the time-reversal 
invariant laws of motion is the exchange that took place in the columns of 
Nature in 1895. This exchange occurred after Boltzmann’s visit to Oxford,
where he received an honorary doctorate and gave a lecture at a meeting of
the British Association for the Advancement of Science. In the wake of this
meeting, Culverwell published a short paper (1894) in Nature in which he
pointed out a “palpable absurdity” in the statement of the H-theorem, as
he had understood it from a presentation in Watson’s textbook (1893). The
“absurdity” was that any time-asymmetrical result could be obtained from 
a derivation that only contained time-symmetrical assumptions. Culverwell 
says that he has not seen Boltzmann’s own proof, but assumed it to be all 
right. Yet, he remained worried about the idea that such a proof could exist at 







all, and ended his letter with the innocent-sounding question: “Will someone
say exactly what the H-theorem proves?”

This question gave rise to about a dozen letters to Nature, each attempt- 
ing to explain the nature of the proof. However, only the contributions of 
Samuel H. Burbury, George Hartley Bryan, and Boltzmann himself are worth 
further consideration. Burbury (1894a,b) is the first author to state clearly
the logic behind Boltzmann’s H-theorem: this theorem depended on a special
assumption, which he called “Condition A”, that he regarded as breaking
the time-reversal invariance of the underlying mechanics, and which therefore
could not itself be grounded in classical mechanics. However, Burbury’s
statement of his Condition A was obscure, and actually does not provide the 
required break in time-reversal invariance. Moreover, in later contributions, 
Burbury showed a remarkable flexibility in what he actually meant by it 
(cf. Dias 1994). 

While Burbury thus succeeded in clarifying the logic behind Boltzmann’s 
H-theorem but failed in pinpointing the relevant physical content of the assumption
needed, the contribution by Bryan may be said to have the opposite
qualities. Bryan (1894) was the first to identify what we now call the
Stoßzahlansatz as the missing ingredient in the proof of the theorem. He
argued that a violation of this condition would require that the particles col- 
liding be endowed with “the power of forethought” to regulate their motions 
so as to move away from equilibrium. As he saw it, the condition was the 
only natural and reasonable one to be imposed, if the particles “are allowed 
to take their own natural course and nothing special is known about them”. 

When Boltzmann entered this exchange in 1895, he stated for the first 
time explicitly that his H-theorem did not rely on mechanics alone
(1895b, 414): 

Though interesting and striking at the first moment, Mr. Culverwell’s 
arguments rest, as I think, only upon a mistake of my assumptions. It 
can never be proved from the equations of motion alone, that the min- 
imum function H must always decrease. It can only be deduced from 
the laws of probability that if the initial state is not selected for some 
special purpose, but haphazard[ness] governs freely, the probability
that H decreases is always greater than that it increases. 

Indeed, in what I shall call the statistical reading of the H-theorem, he now
makes the following claims: If the number of molecules is very large but finite,
the H-function will almost always have the following properties:

1. Most of the time it remains close to its minimum value. 

2. Only in the rarest cases does the curve rise to a peak (or “hump”) above 
this minimum value. 

3. The probability of a peak decreases rapidly with height. 

4. Whenever the value of H is very close to its minimum, the velocity dis- 
tribution is almost Maxwellian. 







In his last contribution to the exchange (1895c), Boltzmann praised Bur- 
bury for pointing out that a non-mechanical assumption (Condition A) was 
needed and that this constituted “the weakest point in the derivation of the 
H-theorem”. However, his formulation of the additional assumption was even
less clear than Burbury’s: 

Condition A is simply this: that the laws of probability are applicable
for finding the number of collisions.

The upshot of this exchange was thus that Boltzmann explicitly stated
that his H-theorem was not a result of pure mechanics (although Boltzmann’s
claims that he had already said this in his earlier papers may remain
disputable). But the exchange failed, at least in Boltzmann’s writing,
to bring about a clear recognition of exactly what additional ingredient was
needed. Burbury’s statement of Condition A had been obscure and flexible.
Boltzmann’s own formulation of what ingredient was needed (viz. that
“haphazard[ness] governs freely” or that “the laws of probability are applicable”)
also lacked sufficient clarity.

Boltzmann’s most definitive statement about what assumption is needed
in the derivation of the H-theorem is the discussion presented in his Vorlesungen
über Gastheorie, which were published in two installments: 1896a
(written before his controversy with Zermelo) and 1898 (written afterwards).
In 1896a, Boltzmann again admits that the derivation of the H-theorem requires
the assumption of a special condition and credits Burbury for pointing
this out. Boltzmann now calls this condition the hypothesis of molecular 
disorder. Unfortunately, Boltzmann is still not very clear about what this 
assumption amounts to. 

On the one hand, 1896a first introduces the notion of molecular disorder
without giving a formal definition, stating only that a molecularly ordered
distribution is one in which “groups of two or a small number of molecules
exhibit definite regularities.” On the other hand, Boltzmann also discusses
an equation which (apart from notational differences) expresses the idea that
pairs of particles that are about to collide should be regarded as independent
before they collide, and argues that the validity of that equation could be
taken as the definition of the statement “the distribution is molecularly disordered”.
In this latter case, he was formulating the assumption in a manner
very close to what we today call the Stoßzahlansatz.

Summing up, we can say that Boltzmann had been emphasizing more and
more strongly in various places that his H-theorem should be given a statistical
reading and did not rely on mechanical assumptions alone. However,
apart from Bryan (1894), none of the contributors to this exchange succeeded 
in stating the required additional assumption clearly, nor pointed out how 
the additional assumption broke time-reversal symmetry. Indeed, it is ques- 
tionable whether Boltzmann himself fully recognized that his hypothesis of 
molecular disorder involved a time-asymmetrical element. The litmus test for 
this issue is, of course, the question of whether a molecularly (dis)ordered 







distribution will transform into another (dis)ordered distribution if we reverse
all the velocities of all the particles. But even here Boltzmann seemed to be
of two minds. In his 1896a he wrote (Boltzmann 1964, 60):

A molecular disordered distribution after reversal of all velocities can 
transform into a molecular-ordered one. 

which suggests that molecular disorder is not invariant under the transformation.
But in the second part of his Vorlesungen über Gastheorie (1898) he
expressed the opposite viewpoint (Boltzmann 1964, 442):

The [molecular] ordered states are not related to the disordered ones 
in the way that a definite state is to the opposite state (arising from 
the mere reversal of the directions of all motions), but rather the 
opposite of each ordered state is again an ordered state. 

In any case, Zermelo ignored many of these developments. It is not possible
for me to judge whether that was because he was not fully aware of this recent
gloss Boltzmann had been putting on his H-theorem, or whether he regarded
those formulations of the H-theorem as too vague to analyze and for that reason
chose to focus exclusively on what could be stated clearly.



3. The Zermelo-Boltzmann controversy 

3.1. Zermelo 1896a 

In his 1896a, Zermelo aims to present a discussion of Poincaré’s recurrence
theorem (Poincaré 1890) and to point out its consequences for statistical
physics, in particular for the kinetic theory of gases. In modern terms, this 
theorem is commonly presented as follows: 

Theorem 3.1 (Recurrence theorem). Let $(\Gamma, \mathcal{A}, \mu, \{T_t\})$ be any dynamical
system such that $\mu(\Gamma) < \infty$, and let $A \in \mathcal{A}$ be any measurable set in $\Gamma$.
Consider any $\tau > 0$ and let

$$B = \{x \in \Gamma : x \in A \ \&\ \forall t > \tau \ (T_t x \notin A)\} , \qquad (3)$$

the set of states in $A$ that after time $\tau$ have left $A$ and will never return to $A$.
Then

$$\mu(B) = 0 . \qquad (4)$$

Here, a dynamical system $(\Gamma, \mathcal{A}, \mu, \{T_t\})$ consists of a measure space $(\Gamma, \mathcal{A}, \mu)$
in the sense of measure theory. 7 To turn a measure space into a dynamical



7 That is, $\mathcal{A}$ is a σ-algebra consisting of all measurable subsets of $\Gamma$, and $\mu$ is a
non-negative valued function on $\mathcal{A}$ satisfying $\mu(\emptyset) = 0$ and $\mu(\bigcup_i A_i) = \sum_i \mu(A_i)$
for all countable sequences of mutually disjoint measurable subsets $A_i$ of $\Gamma$.







system, one further assumes the existence of a one-parameter group of measurable
evolution operators $T_t : \Gamma \to \Gamma$ for all $t \in \mathbb{R}$ which has the group
property $T_t \circ T_{t'} = T_{t+t'}$ and is measure preserving:

$$\forall t \in \mathbb{R} , \ \forall A \in \mathcal{A} : \ \mu(T_t A) = \mu(A) . \qquad (5)$$

Of course, neither Poincaré nor Zermelo had recourse to concepts of measure
theory or to the theory of dynamical systems, but their statements that
only “exceptional states” x would have the property that they never return
to a set where they were initially located, and their elucidation that the qualification
“exceptional” means that those points only make up an “extension”
(where we would say “measure”) zero, indicate that their grasp of the concept
was already firm. 8 Poincaré and Zermelo both state their theorem only for
the special case of the Lebesgue measure, i.e. the special choice of measure
that makes open sets in a Euclidean phase space measurable and assigns a
measure value to such sets which equals their Euclidean volume. However,
the validity of the recurrence theorem does not depend on that choice.
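A finite toy system makes the content of the recurrence theorem tangible. The sketch below is a heavily simplified stand-in, not Poincaré's or Zermelo's setting. It assumes discrete time (iteration of a map), a phase space consisting of an N × N grid of integer points, and the counting measure. The map is the discrete version of Arnold's cat map, chosen here because it is invertible and therefore preserves the counting measure. The names cat_map and return_time are hypothetical.

```python
def cat_map(state, n):
    """Discrete Arnold cat map on the n x n integer torus (an invertible map)."""
    x, y = state
    return ((2 * x + y) % n, (x + y) % n)


def return_time(state, n):
    """Number of iterations until the orbit of `state` first returns to `state`."""
    current = cat_map(state, n)
    steps = 1
    while current != state:
        current = cat_map(current, n)
        steps += 1
    return steps


N = 50  # assumed grid size; the phase space has N * N states
states = [(x, y) for x in range(N) for y in range(N)]

# A bijection of a finite set preserves the counting measure.
is_bijection = len({cat_map(s, N) for s in states}) == len(states)

# Finite measure plus measure preservation: every state recurs.
longest_return = max(return_time(s, N) for s in states)
print(is_bijection, longest_return)
```

In this finite setting the set of non-recurring states is empty, not merely of measure zero. The theorem's two essential ingredients, finite total measure and measure preservation, are nonetheless the same.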

Zermelo notes immediately that this theorem implies an objection to the 
kinetic theory of gases and argues that the latter needs to be “fundamentally 
revised”. He further states that Poincaré had failed to draw attention to the
consequences of this theorem for the kinetic theory of gases. He is wrong
about this: Poincaré had done so in his 1893c, in the Revue de Métaphysique
et Morale, a journal that may not have been part of the staple literature that
Zermelo or other physicists regularly read. Actually, Poincaré’s 1893c version
of the objection is in an important sense different from the way Zermelo
framed it: Poincaré phrases the objection as a conflict between mechanicism
and experience; whereas Zermelo presents it as a contradiction between two 
physical theories: mechanics and thermodynamics. Indeed, Zermelo’s paper 
never brings empirical considerations into the discussion. 

Zermelo starts in 1896a by giving a proof of the theorem. He considers a 
system of N material points and assumes that their motion is governed by 
some first-order differential equations [(1)]. 9 He argues that these equations
of motion will be integrable, i.e., that they guarantee that for each time $t$ a
later state $P_t$ will correspond to every initial $P_0$ at time $t = 0$, as given by the
solutions [(3)]. In the formulation given above (where states are denoted by
$x$ rather than $P$), this corresponds to the idea that the evolution operators
$T_t : \Gamma \to \Gamma$ are defined with $T_t : x \mapsto T_t x = x_t$.

Actually, Zermelo’s treatment here is a bit sloppy: the conditions he men- 
tions on the equations of motion do not actually guarantee the correspondence 

8 Actually, not all the ingredients of the theory of dynamical systems are needed
to guarantee the validity of the recurrence theorem. Essential in this theorem are
the assumptions that $\mu(\Gamma)$ is finite and that $T_t$ preserves measure (i.e. equation
(5)); but otherwise, one could, e.g., also employ the weaker assumption that $\{T_t\}$
is just a one-parameter semigroup instead of a group (cf. Brown et al. 2009).

9 The notation “[(n)]” is used to refer to the equations in Zermelo 1896a. 







of a unique later state to each initial state; for that purpose the functions $X_\mu$
in [(1)] (or in the Hamiltonian equations introduced later) must not only be
continuous, but also Lipschitz-continuous. (Poincaré (1890) had also failed
to mention this condition.) This omission was pointed out, however, in Boltzmann
1897b.

Next, Zermelo introduces a set $g_0$ of initial states obeying inequalities of
the form $F(x_0) < 0$ for some unspecified, but presumably continuous, functions
$F$. If continuity is assumed, this makes $g_0$ (i.e. $A$ in the formulation
above) the “continuously extended area”, as Zermelo calls it, i.e., an arbitrary
open set in $\Gamma$. Actually, nothing in the proof of the recurrence theorem hinges
on this assumption; the theorem as formulated above holds for any measurable
set $A$ and does not depend on topological ingredients. However, taking
the set $A$ to be open (in the Borel topology on $\Gamma$) implies that it has positive
(Lebesgue) measure $\mu(A) > 0$. This consequence is important in the subsequent
argument that the subset $B \subset A$ of initial states that never return to
$A$ after time $\tau$ must be “exceptional” or “singular” because $\mu(B) = 0$. After
all, if $\mu(A) = 0$ is allowed to begin with, the result that $\mu(B) = 0$ would not
sustain the claim that the states in $B$ would have to be exceptional in $A$.

Zermelo considers the “extension” $\gamma_0$ of $g_0$, i.e. the (Lebesgue) measure of
the set $g_0$, $\gamma_0 = \mu(g_0)$, and notes that by Liouville’s theorem 10 this measure
is preserved through time, i.e., if $g_t = T_t g_0$:

$$\gamma_t = \mu(g_t) = \mu(g_0) = \gamma_0 . \qquad (6)$$



Next, Zermelo considers the union of the sets $g_t$, 11

$$G_0 = \bigcup_{t \ge 0} g_t . \qquad (7)$$

Now consider the time evolution of this set $G_0$, i.e. $G_\tau = T_\tau G_0$, for $\tau > 0$.
On the one hand, it is easy to show that $G_\tau = \bigcup_{t \ge \tau} g_t$, which implies that
$G_\tau \subset G_0$. As Zermelo puts it: “this change [of $G_\tau$ during its evolution in
time] is due always to the disappearance of early states, but never to the
appearance of new states.” On the other hand, by Liouville’s theorem, the
measure of $G_0$ is also conserved under the evolution, so that $\mu(G_\tau) = \mu(G_0)$.

10 Actually, Zermelo’s paper may well have been one of the first times that the result 
is called “Liouville’s theorem”. Boltzmann used the theorem on many previous 
occasions, but referred to it as “Jacobi’s theorem on the last multiplier”. 

11 By modern standards, one might prefer instead of (7) to consider a discrete
version of this union, i.e. something like $G_0 = \bigcup_{n=0}^{\infty} g_{t+n\tau}$ for some $\tau > 0$, since
otherwise the set $G_0$ might fail to be measurable, i.e. not belong to the σ-algebra
$\mathcal{A}$. However, if we specialize to Lebesgue measure, and assume $g_t$ open for all
$t > 0$, as seems to be Zermelo’s intention, the problem evaporates, since even in
an uncountable union such as (7), $G_0$ is still an open set and hence Lebesgue
measurable. Indeed, these observations do not affect the validity of Zermelo’s
proof.







It follows that the difference, the set of “disappearing states”,

$$B := G_0 \setminus G_\tau = \bigcup_{0 \le t < \tau} g_t , \qquad (8)$$

is a measure zero set. But this set contains exactly those states that were in
$g_0$ but after a time lapse $\tau$ never return to $g_0$.
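The measure-zero step can be spelled out in one line. The following is a sketch in the notation of the modern formulation above, using only the inclusion of the evolved set in the original one, the finiteness of the total measure, and measure preservation (5):

```latex
\mu(G_0 \setminus G_\tau)
  = \mu(G_0) - \mu(G_\tau)
  = \mu(G_0) - \mu(T_\tau G_0)
  = 0 ,
\qquad \text{since } G_\tau \subset G_0 \text{ and } \mu(G_0) \le \mu(\Gamma) < \infty .
```

The finiteness assumption is what licenses the subtraction: for a set of infinite measure the difference of the two measures would be undefined.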

After providing this proof of the recurrence theorem, which is essentially
the same as Poincaré’s proof (even to the point of both failing to mention
the need for Lipschitz-continuity), Zermelo provides a corollary:

Theorem 3.2 There exists no single-valued and continuous function $S$ on $\Gamma$
that has the following property: One can find an open set $g$, however small,
such that $\forall x \in g$, $S(T_t x)$ is monotonically increasing as a function of $t$.

Interestingly, even before he had found the recurrence theorem, Poincaré
(1889) also had considered the question whether it was possible to define a
function of the state that would have the property that it increases along the
dynamical trajectories and claimed this was not possible. This paper, which
was noticed by Zermelo, has been translated and analyzed by Elwood Olsen
(1993). Olsen concluded that this attempt by Poincaré was unconvincing.
Poincaré himself apparently did not realize that his own recurrence theorem
allowed a more secure basis for such a claim.

Harvey Brown et al. (2009) gave a slightly extended formulation of Zer- 
melo’s corollary in which the assumption of continuity of the function S is 
weakened to the assumption that S is an integrable non-negative function, 
and the open set g is replaced by an arbitrary measurable set. Their formu- 
lation is as follows: 

Theorem 3.3 Let $(\Gamma, \mathcal{A}, \mu, \{T_t\})$ be any dynamical system such that $\mu(\Gamma) < \infty$,
and let $g_0$ be any measurable set in $\Gamma$ and $G = \bigcup_{n=0}^{\infty} T_{n\tau}\, g_0$ for some time
$\tau > 0$. Then there exists no integrable non-negative function $S$ on $\Gamma$ such
that

$$\int_{T_t(G)} S \, d\mu$$

is monotonically increasing as a function of $t$.

Remarkably, this extended version is not only implied by, but is fully
equivalent to, the recurrence theorem. Indeed, as long as we assume that
the dynamical flow is smooth (which is guaranteed by assuming a Lipschitz-continuous
Hamiltonian), for $g_0$ an arbitrary measurable set in phase space,
we can define a function $S$ on $\Gamma$ by

$$S(x) := \inf\{t > \tau : T_{-t}\, x \in g_0\} . \qquad (9)$$

In words, this function simply indicates, for any choice of $x$, how much time
has elapsed (after a threshold time $\tau$) since the trajectory through $x$ last
passed through region $g_0$. If the trajectory through $x$ never passes through







$g_0$ (and the infimum is over the empty set), we simply stipulate that $S(x) = 0$.
Clearly, $S(T_t x)$ will be monotonically increasing in $t$ iff the trajectory through
$x$ never recurs to $g_0$.
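The behaviour of such a "time since last visit" function can be seen in a toy example. The sketch below is an illustration under stated assumptions, not the construction of Brown et al.: time is discrete, the phase space is a ring of N states, the flow is a rigid shift (which preserves the counting measure), the region G0 is an arbitrary pair of states, and the strict inequality in (9) is replaced by "at least TAU steps". All names are hypothetical.

```python
N = 12       # assumed number of states on the ring
G0 = {0, 1}  # assumed region g_0
TAU = 1      # threshold time, in steps


def shift(x, t):
    """Measure-preserving flow: move t steps around the ring (t may be negative)."""
    return (x + t) % N


def time_since_last_visit(x):
    """Smallest t >= TAU such that the state t steps before x lies in G0; 0 if none."""
    for t in range(TAU, N + 1):
        if shift(x, -t) in G0:
            return t
    return 0


# Follow the trajectory starting at state 5 for two full turns of the ring.
values = [time_since_last_visit(shift(5, t)) for t in range(2 * N)]
print(values)
```

Along the trajectory the function grows by one per step while the state stays away from G0 and drops back as soon as the trajectory has re-entered G0. Because every trajectory of this finite system recurs, no trajectory gives a function that increases forever.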

After proving the two theorems mentioned above, Zermelo replaces the 
abstract differential equations of motion [(1)] by the familiar Hamiltonian 
equations of motion. He argues that all attempts to give an atomistic, me- 
chanical account of the behavior of matter, e.g. the kinetic theory of gases, 
fall under this Hamiltonian framework. 

Zermelo formulates the conclusion of his theorems as an obstacle for “irre- 
versible” processes. Unfortunately, he does not define that term. He may have 
meant by “irreversible” processes those that display a permanent approach 
towards equilibrium, i.e. processes in which any initial non-equilibrium state
eventually moves towards, and permanently remains within, a region of phase
space associated with equilibrium. Another reading might be that he means
by “irreversible” processes evolutions such that some continuous entropy-like
function on phase space increases monotonically with time. Either way, it is
clear that the recurrence theorem or its corollary implies that no Hamiltonian
dynamical system can be irreversible.

Zermelo carefully discusses various options for avoiding this conclusion: 

1. We can assume that the system has no bounded phase space. This could 
be achieved by allowing the particles (a) to reach unbounded positions in 
space, or (b) to attain unbounded velocities. Option (a) is however excluded 
by the assumption that a gas is contained in a finite volume. Option (b) 
could be achieved when the gas consists of point particles which attract each 
other at small distances (e.g. an $F \propto r^{-2}$ inter-particle attractive force can
accelerate them toward arbitrarily high velocities). However, Zermelo argues 
on physical grounds that one ought to assume that there is always repulsion 
between particles at very small distances. 

2. We can assume that the particles act upon each other by velocity- 
dependent forces. This, however, would lead to a violation either of the conservation
of energy or of the law of action and reaction, both of which Zermelo
regards as essential to atomic theory. 

3. The H-theorem holds only for those special initial states which are the exception
to the recurrence theorem, and we assume that only those states are
realized in nature. This option would be irrefutable, says Zermelo. Indeed,
the reversibility objection has already shown that not all initial states can 
correspond to the second law. However, here we would have to exclude the 
overwhelming majority of all imaginable initial states, since the exceptions to 
the recurrence theorem only make up a set of total extension (i.e. measure) 
zero. Moreover, the smallest change in the state variables would transform a 
singular state into a recurring state, and thus suffice to destroy the assump- 
tion. Therefore, this assumption would be quite unique in physics and I do 
not believe that anyone would be satisfied with it for very long. 




208 Jos Uffink 



This leaves only two major options. 

4. The Carnot-Clausius principle is to be altered. 

5. The kinetic theory is to be formulated in an essentially different way, or 
even be given up altogether. 

Zermelo’s 1896a does not express any preference between these last two 
options. He concludes that his aim has been to explain as clearly as possible 
what can be proved rigorously and hopes that this will contribute to a re- 
newed discussion and final solution of the problem. (Of course, his next paper 
shows that his own preferences lie along option 5.) 

I would like to emphasize that, in my opinion, Zermelo’s argument is fair 
and entirely correct. If he can be faulted for anything, it is that he had not 
noticed that Boltzmann, in his very recent papers, had already been putting 
a different gloss on the H-theorem.

3.2. Boltzmann 1896 

Boltzmann first states that he has repeatedly pointed out that the theorems 
of the kinetic theory of gases have the character of “statistical truths”, and 
refers to several of his previous works to substantiate the claim. 

The claim is fair enough: Boltzmann had pointed out the statistical aspect 
of his understanding of the theorems of the theory of gases (in particular 
the H-theorem) already in his reply (1877a) to Loschmidt, and with much
more emphasis in his contribution to the debate in Nature (1895b) just a 
year before Zermelo’s article was written. However, I think it is also fair to 
note that Boltzmann never explained clearly what the exact nature or status 
of such “statistical truths” were, or in what sense they were independent 
from mechanical considerations. Moreover, it also seems fair to note that 
in spite of Boltzmann’s repeated warnings that such theorems had to be 
taken in a statistical sense, there are also occasions, not just in 1872 but 
also fairly recently before the debate with Zermelo, where he did not refer 
to statistical considerations, but claimed that the theorems of the theory of 
gases were rigorous analytical theorems from mechanics alone. For example, 
in 1892, after giving yet another proof of the Maxwell distribution in thermal
equilibrium, he concluded (p. 432): “I believe therefore that its correctness 
[i.e. that of the Maxwell distribution] as a theorem of analytical mechanics 
can hardly be doubted.” 

As I said earlier, Boltzmann’s writings are not easy to interpret, even to- 
day. Zermelo may well have chosen to avoid the labor of going into an analysis 
of what Boltzmann might have meant and to focus instead on stating what 
could be deduced from Poincaré’s theorem. Yet it is equally understandable
that Boltzmann might have felt misunderstood by Zermelo’s paper, since
he had occasionally emphasized that the H-theorem and the permanence
of the Maxwell distribution should not be conceived of as rigorous theorems in
mechanics. 




Introductory note to 1896a, 1896b, and Boltzmann 1896, 1897 209 



Boltzmann’s rejoinder, which he apparently wrote within two days, starts 
with a sarcastic remark, claiming that he “cannot but take delight” in Zer- 
melo’s essay as “the first proof that these works of mine receive any attention 
at all in Germany”. 12 

Boltzmann correctly points out that if the number of molecules is infi- 
nite, the recurrence theorem does not apply (because the phase space is not 
bounded in this case). 

Boltzmann then repeats the claims 1. to 4. (cf. p. 201) he made in his 
contribution 1895b to the Nature debate, and which I called the statistical
reading of the H-theorem. The only remark I can add is that although
Boltzmann presented these claims in his 1895b as having been “proved in my 
papers” and in his present reply introduced these claims by the phrase, “As I 
have already shown in the contribution to Nature ”, he actually never gave a 
demonstration of the validity of these claims. Indeed, most of these claims are 
too vague to admit of a rigorous demonstration. In particular, an ambiguity 
which persists in much of Boltzmann’s writings is the question whether he 
intends “probability” to be measured by duration in time or by measure in 
phase space. Thus, when he speaks about a hump in the H-curve occurring
“only in the rarest cases” above, the intended meaning could be that such 
humps are rare in time, but also that they only occur for very rare choices 
of the initial states — or both. And then there is the question of whether and 
how the Stoßzahlansatz or similar conditions might be needed to demonstrate
their validity, which Boltzmann leaves completely untouched. Thus, up to this 
point, Boltzmann describes how his own views on the interpretation of his 
1872 H-theorem had developed.

Next, Boltzmann states a first disagreement with Zermelo: Zermelo be- 
lieves that it is only for exceptional initial states that the gas comes ever closer 
to satisfying Maxwell’s law of distribution, and that this does not seem right 
to him. Instead, Boltzmann claims, it is only for exceptional initial states 
that the Maxwell distribution never holds, while for the vast majority of initial
states the H-curve has the properties just stated. This disagreement,
however, is only an optical illusion. When Zermelo stated (1896a) that “it 
is [. . . ] impossible to show [. . .] the well-known law of velocity distribution 
among gas molecules to be the stationary final state regularly reached after 
some time, as its discoverers, Maxwell and Boltzmann intended to”, he clearly 
meant with a “stationary final state” a condition that was permanent, and 
would never be changed at any later time. When Boltzmann claimed that 
it is only for exceptional states that the Maxwell distribution will never be 
reached for some period of time, he is of course also right, at least for typical 
gas models. But this claim does not contradict Zermelo’s at all, because the 
quantifiers are in a different order. 
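The difference in quantifier order can be spelled out schematically. The notation below is introduced here for illustration only and appears in neither author's text: $x_t$ denotes the state at time $t$ that evolves from an initial state $x_0$, and $M$ denotes the set of states whose velocity distribution is Maxwellian.

```latex
% Zermelo denies, for all non-singular initial states x_0,
% that the Maxwell distribution is reached and then kept for ever:
\exists\, T \;\; \forall\, t \ge T : \quad x_t \in M .

% Boltzmann asserts, for all but exceptional initial states x_0,
% only that the Maxwell distribution is reached at some time:
\exists\, t : \quad x_t \in M .
```

The first statement implies the second but not conversely, so denying the first is compatible with asserting the second.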



12 This remark seems somewhat odd, since Boltzmann’s previous call to the Univer- 
sity of Berlin, his election to the Prussian Academy, and his previous debate with 
Planck made clear that Boltzmann’s work did not escape notice in Germany. 







In fact, Boltzmann subsequently admits the validity of the recurrence 
theorem, but sees this as fully consistent with his own approach. However, 
he disputes the conclusion that the mechanical approach should be modified 
or even abandoned. He argues that this conclusion would only be justified 
if the mechanical approach violates our experience. To show that this is not 
the case, Boltzmann argues by means of a thought experiment, elaborated 
in the appendix of his paper, that the recurrence time for even a cubic cm 
of gas could be truly enormous (10 10 seconds) and hence utterly escape 
observation. 

From a historical point of view it is interesting that Boltzmann points out 
other examples of improbable yet not impossible states of gas that would not 
require a quasi-recurrence to an original state, like fluctuations in pressure 
or chemical transformations at temperatures below the reaction threshold. 
Boltzmann argues that such improbable transitions have actually been ob- 
served. The most pregnant of his remarks is that “observations were made of 
movements of very small corpuscles, which may be due to the fact that in 
such cases a pressure which is sometimes a little greater, sometimes a little 
smaller, really acts on a part of their surface that no longer vanishes com- 
pared to their entire surface.” Here, Boltzmann seems to be referring to what 
we now know as Brownian motion, the subject that Einstein would deal with 
in much more detail in 1905 and that would eventually contribute to a much 
wider acceptance of the reality of molecules. If Boltzmann had paid more 
attention to how the predictions of the theory of gases relate to such
observations and had played this empirical card, which he held up his sleeve,
with more emphasis, he might have anticipated Einstein in this regard.
Boltzmann ends with a biting remark: 

All objections against the mechanical approach to nature are there- 
fore unfounded and based on mistakes. Anyone unable to overcome 
the difficulties attendant on a clear understanding of the principles 
of the theory of gases really ought to heed Mr. Zermelo’s advice and 
resolve to abandon the theory altogether. 

Given the fact that Zermelo had set out to ascertain what can be proved 
rigorously and what cannot, Boltzmann’s hasty response may be rather
disappointing, since he provided no clear statement of how he understood the 
statistical reading of the H-theorem or of the conditions on which it relied. 
Still, Boltzmann is obviously correct when he says that Zermelo’s objection 
did not lead to a conflict between the theory of gases and experience. Thus, 
his argument would have been more successful as a counter-argument to 
Poincaré than to Zermelo.

3.3. Zermelo’s response 1896b 

In his 1896b, Zermelo notes that Boltzmann’s response confirms his views by
admitting that Poincaré’s theorem is correct and applicable to a closed system







of a finite number of gas molecules. Hence, in such a system, all [sic!] motions 
“are periodic and hence strictly non-irreversible”. Thus, the kinetic theory of
gases cannot assert that there is a strict monotonic increase of entropy as the 
second law would require. He adds that this general clarification was not at 
all superfluous. 

Therefore, Zermelo argues, his main point had been conceded: there is 
indeed a conflict between thermodynamics and kinetic theory, and it remains 
a matter of taste which of the two is to be abandoned. Zermelo admits that 
observation of the Poincaré recurrences may well fall beyond the bounds
of human experience. He points out (correctly) that Boltzmann’s estimate 
of the recurrence time presupposes that the system visits all other cells in 
phase space before recurring to an initial state. This estimate is inconclusive, 
since the latter assumption is somewhat ad hoc. In general, these recurrence 
times need not “come out so ‘comfortingly’ large”. But, as I stressed before, 
the relation with experience simply was not an issue in Zermelo’s objec- 
tion. 

The main part of Zermelo’s reply analyzes the justification of and con- 
sequences drawn from Boltzmann’s assumption that the initial state is very 
improbable, i.e., that $H_0$ is very high. Zermelo argues that even in order to
obtain an approximate or empirical analogue of the second law as Boltzmann 
envisaged, i.e. an approach to a long-lasting, but not permanent equilibrium 
state, it would not suffice to establish this result for one particular initial 
state. Rather, one would have to show that evolutions always take place in 
the same sense, at least during observable time spans. 

As Zermelo understands it, Boltzmann does not merely assume that the 
initial state has a very high value for H, but also that, as a rule, the initial 
state lies on a maximum or has just passed a maximum. If this assump- 
tion is granted, then it is obvious that one can only observe a decreasing 
flank of the H-curve. However, Zermelo protests, one could have chosen any
time as the initial time. In order to obtain a satisfactorily general result, the 
additional assumption would thus have to apply at all times. But then the 
H-curve would have to consist entirely of maxima. This leads to nonsense,
Zermelo argues, since the curve cannot be constant. Zermelo concludes that 
Boltzmann’s assumptions about the initial state are thus in need of further 
physical explanation. 

Further, Zermelo points out that probability theory, by itself, is neutral 
with respect to the direction of time, so that no preference for evolutions in 
a particular sense can be derived from it. He also points out that Boltzmann 
apparently equates the duration of a state and its extension (i.e. the relative 
time spent in a region and the relative volume of that region in phase space). 
“That he has actually demonstrated this property for his function H [. . . ] 
I fail to see, since, in my view, probability and duration of a state are not 
identical” (1896b, 796). 
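Zermelo's distinction between the duration of a state and its extension can be made concrete in a toy model. The sketch below rests on assumptions of my own choosing: instead of a gas it uses a rotation of the unit circle, $x \mapsto x + \alpha \pmod 1$, which preserves length for every $\alpha$. The function name `time_fraction` and all parameter values are hypothetical.

```python
import math


def time_fraction(alpha, x0, a, b, steps):
    """Fraction of the first `steps` iterates of x -> x + alpha (mod 1) lying in [a, b)."""
    hits = 0
    for k in range(steps):
        x = (x0 + k * alpha) % 1.0
        if a <= x < b:
            hits += 1
    return hits / steps


if __name__ == "__main__":
    a, b = 0.0, 0.25      # a region whose extension (length) is 0.25
    steps = 100_000

    # Irrational rotation number: the orbit spreads evenly over the circle,
    # so the relative time spent in the region is close to its extension.
    golden = (math.sqrt(5) - 1) / 2
    print("irrational rotation:", time_fraction(golden, 0.1, a, b, steps))

    # Rational rotation number 1/2: the orbit alternates between 0.1 and 0.6,
    # so it spends half of its time in a region of extension one quarter.
    print("rational rotation:  ", time_fraction(0.5, 0.1, a, b, steps))
```

Both maps preserve extension, yet only in the first does relative duration agree with relative extension. The equality is therefore an additional property that has to be demonstrated for the dynamics in question, which is the gap Zermelo points to.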







3.4. Boltzmann’s second reply 1897

In his second reply Boltzmann rebuts Zermelo’s demand for a physical ex- 
planation of his assumptions about the initial state of the system with the 
claim that the question is not what will happen to an arbitrarily chosen ini- 
tial state, but rather what will happen to a system in the present state of the 
universe. 

He argues that one may start from the (admittedly unprovable)
assumption that the universe (or at least a very large part of the universe) that
surrounds us started in a very improbable state and still is in an improba- 
ble state. If one then considers a small system (e.g. a gas) that is suddenly 
isolated from the rest of the universe, there are the following possibilities: 
(i) The system may already be in equilibrium, i.e. H is close to its minimum 
value. This, Boltzmann says, is by far the most probable case. But among the 
few cases in which the system is not in equilibrium, the most probable case 
is (ii) that H will be on a maximum of the H-curve, so that it will decrease
in both directions of time. Even more rare is the case in which (iii) the initial
value of H will fall on a decreasing flank of the H-curve. But such cases are
just as frequent as those in which (iv) H falls on an increasing flank. 13 

Thus, Boltzmann’s explanation for the claim that H is initially on a max- 
imum is that this is the most likely case for a system in a non-equilibrium 
state which becomes isolated from the rest of the universe in its present, 
improbable state. 

Boltzmann does not respond to Zermelo’s requests for more definite proofs 
of his claims, in particular the equality of averages over phase space volume 
and time averages. He bluntly states that he has thirty years of priority in 
measuring probabilities by means of phase space volume (which is true) and 
adds that he always had done so (which is false). Indeed, Boltzmann continues
to equivocate between these two senses of averaging: A few lines below, he 
claims that the most probable states will also occur most frequently, except 
for a vanishingly small number of initial states, but does not attempt to 
justify such a claim, in spite of Zermelo’s complaint that he could not find a 
proof of such claims. 

3.5. Concluding remarks 

Boltzmann’s replies to Zermelo have been recommended as “superbly clear 
and right on the money” (Lebowitz 1999, 347). However, as will be clear
from the above, I do not share this view. See also M. Klein 1973, Curd 1982, 
Batterman 1990, Cercignani 1998, Brush 1999, Earman 2006, and Frigg 2008 
for other commentaries on the Zermelo-Boltzmann controversy. 

13 The Ehrenfests later (Ehrenfest and Ehrenfest 1912) added a final possible case
(v): H may initially be on a local minimum of the H-curve, so that it increases
in both directions of time. But by a similar reasoning, that case is even less 
probable than the cases mentioned by Boltzmann. 







It is clear that, in at least one main point of the dispute, Boltzmann and 
Zermelo had been talking past each other. When Zermelo argued that in the 
kinetic theory of gases there can be no continual approach towards a final 
stationary state, he obviously meant this in the sense of a limit $t \to \infty$. But
Boltzmann’s reply indicates that he took the “approach” as something that is 
not certain but only probable, and as lasting for a very long, but finite time. 
His graph of the H-curve (1897, 398) makes abundantly clear that he does
not intend to claim that $\lim_{t \to \infty} H(t)$ even exists.

It is true that Boltzmann’s statistical reading of the H-theorem, which he
stressed in his second reply, had already been explicit in 1895b, and thus he
could claim with some justification that this development in his thinking had 
been overlooked by Zermelo. But in fairness, one must note that, even only 
just before the debate, Boltzmann had also expressed views which seemed to 
contradict this statistical reading of the H-theorem. Indeed, the first volume
of Boltzmann’s Vorlesungen über Gastheorie (1896a) stressed, much like his
original 1872 on the H-theorem, the necessity and exceptionless generality of
the H-theorem, adding only that the theorem depended on the assumption
of molecular disorder (1896a, § 5, 38): 

[T]he quantity designated as H can only decrease; at most it can 
remain constant. [. . . ] The only assumption we have made here is that 
the distribution of velocities was initially ‘molecularly disordered’ and 
remains disordered. Under this condition we have therefore proved 
that the quantity called H can only decrease and that the distribution 
of velocities must necessarily approach the Maxwell distribution ever 
more closely. 14 

The point here is not whether the claim that H can only decrease in the course 
of time could be derived from pure analytical mechanics: Boltzmann states 
clearly in this quote that an additional assumption of molecular disorder 
is involved, which, although its precise meaning might be unclear, seems to 
involve a motivation that goes beyond mechanics. Rather, the point is that in 
this quote Boltzmann still expressed the claim that, with this assumption in 
place, one could prove a necessary and steady approach towards equilibrium, 
without exceptions. This claim, of course, is quite different from Boltzmann’s 
statistical reading of the H-theorem in 1895b. Zermelo was thus at least
equally justified in claiming that Boltzmann’s clarification “was not at all 
superfluous” (1896b, 793). 



14 In his reply to Zermelo, Boltzmann claimed that his discussion of the H-theorem
in the Vorlesungen über Gastheorie was intended under the explicitly emphasized
assumption that the number of molecules was infinite, so that the recurrence 
theorem did not apply. However, I can find no mention of such an assumption in 
this context. On the contrary, the first occasion on which this latter assumption 
appears is in §6 on page 46 where it is introduced as “an assumption we shall 
make later”, suggesting that the previous discussion did not depend on it. 







To repeat, Boltzmann’s various presentations of claims about the H- 
theorem, and the assumptions involved, are not easy to digest, even today. 
To date, the most elaborate attempt to provide a version of a proof of a 
statistical H-theorem which remains close to Boltzmann’s original intentions
is Lanford 1975 and the subsequent work by authors elaborating on this 
approach. However, it would take us too far afield to delve into that.

To conclude, Zermelo’s dispute with Boltzmann in 1896-7 is rather well- 
known and has often been commented on before, both from a historical and 
from a philosophical foundations of physics perspective. In the popular lit- 
erature dealing with this episode, a picture has often been painted in which 
Boltzmann had the upper hand and Zermelo’s objections are described as 



Ueber einen Satz der Dynamik und die 
mechanische Wärmetheorie

1896a 



Im zweiten Kapitel der Poincaré’schen Preisschrift über das Dreikörperproblem 1
findet sich ein Satz bewiesen, aus welchem hervorgeht, dass die verbreiteten
Vorstellungen von der Wärmebewegung der Molecüle, wie sie z. B.
der kinetischen Gastheorie zu Grunde liegen, einer wesentlichen Abänderung
bedürften, um mit dem thermodynamischen Hauptsatze von der Vermehrung
der Entropie vereinbar zu werden. Dieses Poincaré’sche Theorem sagt
aus, dass in einem System von materiellen Punkten unter Einwirkung von
Kräften, die allein von der Lage im Raume abhängen, im allgemeinen ein
einmal angenommener durch Configuration und Geschwindigkeiten charakterisirter
Bewegungszustand im Laufe der Zeit, wenn auch nicht genau, so doch
mit beliebiger Annäherung noch einmal, ja beliebig oft wiederkehren muss,
vorausgesetzt, dass die Coordinaten, sowie die Geschwindigkeiten nicht ins
Unendliche wachsen. In einem solchen System sind daher, von singulären
Anfangszuständen abgesehen, irreversible Vorgänge unmöglich, es kann keine
eindeutige und stetige Function der Zustandsgrössen wie die Entropie
fortwährend zunehmen, da jeder endlichen Zunahme bei der Rückkehr in den
Anfangszustand wieder eine Abnahme entsprechen müsste. Hr. Poincaré bedient
sich in der genannten Abhandlung seines Satzes zu astronomischen
Erörterungen über die Stabilität des Sonnensystemes, er scheint aber seine
Anwendbarkeit auf Systeme von Molecülen oder Atomen und damit auf
die mechanische Wärmetheorie nicht bemerkt zu haben, wiewohl er gerade

1 Poincaré, „Sur les équations de la dynamique et le problème des trois corps“,
Acta Mathematica 13. p. 1-270. 1890; der betreffende Satz p. 67-72.




Introductory note to 1896a, 1896b, and Boltzmann 1896, 1897 215 



hostile, misguided, or wrongheaded. However, Zermelo clearly had the better 
arguments in this debate. Although it seems clear that he did not sympathize 
with Boltzmann’s approach, his objections were stated fairly and precisely. 
By contrast, Boltzmann’s own responses added a sense of hostility to the 
controversy, and failed to answer or fully elucidate his position on the ques- 
tions Zermelo was asking. Of course, this view on the Zermelo-Boltzmann 
controversy does not deny the fact that in the early decades of the 20th cen- 
tury, the mechanistic-atomistic approach championed by Boltzmann gained 
a clear victory over its alternatives. However, a clear and commonly accepted 
answer to the question of how to explain irreversible phenomena in statistical
physics has not been reached. 



On a theorem of dynamics and the 
mechanical heat theory 

1896a 



A theorem proved in the second chapter of Poincaré's prize essay on the
three-body problem 1 suggests that widespread ideas about the heat motion 
of molecules such as those underlying the kinetic theory of gases need to be 
fundamentally revised so as to become compatible with the law of
thermodynamics concerning the increase of entropy. This theorem by Poincaré states
that, in a system of material points under the action of forces depending only
on location in space, a state of motion that is characterized by configura- 
tion and velocities and that has occurred once must in general occur again 
one more time, even if not as exactly the same, but at least in arbitrary 
approximation, and even arbitrarily many times, provided that neither the 
coordinates nor the velocities grow infinitely large. In a system of this sort, 
irreversible processes cannot possibly occur, with the exception of singular 
initial states, and no single-valued and continuous function of the state vari- 
ables such as entropy can always increase, since to every finite increase there 
would have to correspond a decrease when the system returns to the initial 
state. In his essay, Poincaré uses his theorem for astronomical considerations
on the stability of the solar system. However, it seems that he has failed to 
notice its applicability to systems of molecules or atoms, and hence to the 
mechanical theory of heat, his special interest in the foundational questions 



1 Poincaré 1890; the relevant theorem p. 67-72.




216 



Zermelo 1896a 



den Grundfragen der Thermodynamik besonderes Interesse zugewandt und
auf einem anderen Wege den Nachweis versucht hat, dass die irreversiblen
Vorgänge aus der v. Helmholtz’schen Theorie der „monocyclischen Systeme“
nicht immer [486] erklärt werden können. 1 Um nun das Studium der umfangreichen
und vielen Physikern schwerer zugänglichen Poincaré’schen Arbeit
nicht voraussetzen zu müssen, schicke ich einen möglichst einfachen Beweis
des angeführten Satzes voraus.

Sei N die Anzahl der materiellen Punkte und werden die n = 6N Zustandsgrössen,
d. h. die 3N Coordinaten und 3N Geschwindigkeitscomponenten mit
$x_1, x_2, \ldots x_n$ bezeichnet, so sind die nach der Zeit genommenen
Differentialquotienten der ersteren identisch mit den entsprechenden
Geschwindigkeitscomponenten, die Ableitungen der letzteren aber, d. h. die
Beschleunigungscomponenten, die Kräfte, nach unserer Annahme eindeutige
und stetige Functionen der Coordinaten. Jene sind also von den Coordinaten,
diese von den Geschwindigkeiten unabhängig, und die Differentialgleichungen
der Bewegung sind von der Form

$$\frac{dx_\mu}{dt} = X_\mu(x_1, x_2, \ldots x_n) \qquad (\mu = 1, 2, \ldots n), \tag{1}$$

wo keine der Functionen $X_\mu$ die entsprechende Variable $x_\mu$ selbst enthält,
sodass die Beziehung besteht:

$$\frac{\partial X_1}{\partial x_1} + \frac{\partial X_2}{\partial x_2} + \cdots + \frac{\partial X_n}{\partial x_n} = 0. \tag{2}$$

In einem solchen Systeme (1) von Differentialgleichungen erster Ordnung
entspricht einem beliebigen Anfangszustand $P_0$:

$$x_1 = \xi_1, \quad x_2 = \xi_2, \quad \ldots \quad x_n = \xi_n, \qquad (t = t_0)$$

ein bestimmter veränderter Zustand P zur Zeit t, ausgedrückt durch die
Integralgleichungen von (1):

$$x_\mu = \varphi_\mu(t - t_0, \xi_1, \xi_2, \ldots \xi_n) \qquad (\mu = 1, 2, \ldots n),$$

wo die $\varphi_\mu$ eindeutige und stetige Functionen ihrer sämmtlichen Argumente
sind, die, unabhängig von der Wahl des Zeitanfanges $t_0$, durch die Functionen
$X_\mu$ allein bestimmt werden. Diese Beziehungen gelten eben so gut für
vorhergehende wie für nachfolgende Zeiten, d. h. eben so gut für [487] negative wie für
positive Werthe von $t - t_0$; der Anfangszustand $P_0$ ist eine beliebige,
willkürlich hervorgehobene Phase der Bewegung, die nicht immer zeitlich voranzu-

1 Poincaré, Compt. rend. 108. p. 550-552. 1889; „Vorles. über Thermodynamik“,
p. 294-296.





On a theorem of dynamics and the mechanical heat theory 217 



of thermodynamics notwithstanding, and even though he has tried to show 
by other means that the irreversible processes cannot always be explained 
on the basis of v. Helmholtz's theory of “monocyclic systems”. 2 To save the 
reader the trouble of delving into Poincaré's work, which is extensive and not
easily accessible to many physicists, I begin by providing as simple a proof 
as possible of this theorem. 

Let N be the number of material points and let the n = 6N state variables,
that is, the 3N coordinates and 3N velocity components, be denoted
by $x_1, x_2, \ldots x_n$. Then the derivatives of the former taken with respect to time
are identical with the corresponding velocity components, while the derivatives
of the latter, i.e., the acceleration components, the forces, are, by our
assumption, single-valued and continuous functions of the coordinates. Thus,
the former are independent of the coordinates, and the latter are independent
of the velocities, and the differential equations of the motion have the
form



$$\frac{dx_\mu}{dt} = X_\mu(x_1, x_2, \ldots x_n) \qquad (\mu = 1, 2, \ldots n), \tag{1}$$



where none of the functions $X_\mu$ contains the corresponding variable $x_\mu$ itself,
so that the following relation obtains:



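$$\frac{\partial X_1}{\partial x_1} + \frac{\partial X_2}{\partial x_2} + \cdots + \frac{\partial X_n}{\partial x_n} = 0. \tag{2}$$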



In such a system (1) of first-order differential equations, to an
arbitrary initial state $P_0$:

$$x_1 = \xi_1, \quad x_2 = \xi_2, \quad \ldots \quad x_n = \xi_n, \qquad (t = t_0)$$

there corresponds a particular altered state P at time t, which is expressed
by means of the integral equations of (1):

$$x_\mu = \varphi_\mu(t - t_0, \xi_1, \xi_2, \ldots \xi_n) \qquad (\mu = 1, 2, \ldots n),$$

where the $\varphi_\mu$'s are single-valued and continuous functions of all their
arguments which are determined solely by the functions $X_\mu$, independently
of the choice of the starting time $t_0$. These relations hold equally for both
earlier and later instants of time, i.e., for both negative and positive values
of $t - t_0$; the initial state $P_0$ is any arbitrarily specified phase of motion,
which need not be the temporal predecessor. Likewise, to a continuously ex- 




2 Poincaré 1889; Poincaré 1893b, p. 294-296.




218 Zermelo 1896a 



gehen braucht. Ebenso entspricht auch einem stetig ausgedehnten Gebiet $g_0$
von Anfangszuständen, ausdrückbar durch Beziehungen der Form:

$$F(\xi_1, \ldots \xi_n) < 0,$$

ein bestimmtes verändertes Gebiet $g = g_t$ zur Zeit t und somit auch dem
über $g_0$ erstreckten nfachen Integrale

$$\gamma_0 = \int d\xi_1\, d\xi_2 \cdots d\xi_n,$$

das wir als die „Ausdehnung“ von $g_0$ bezeichnen wollen, im allgemeinen eine
andere Ausdehnung von g

$$\gamma = \int dx_1\, dx_2 \cdots dx_n.$$

In dem besonderen Falle aber, wo die Functionen $X_\mu$ der Bedingung (2)
genügen, ist nach dem Satz von Liouville 1 das zweite Integral gleich dem
ersten und damit von der Zeit unabhängig, wie auch das Gebiet $g_0$ oder g,
deren jedes durch das andere bestimmt ist, gewählt sein möge, sodass man
abgekürzt schreiben kann:

$$d\gamma = dx_1\, dx_2 \cdots dx_n = d\gamma_0 = \text{const.} \tag{4}$$

„Die Folgezustände, die den Anfangszuständen eines beliebigen Gebietes
entsprechen, erfüllen in jedem Augenblick Gebiete von der gleichen Ausdehnung.“

Ein beliebiges Gebiet $g_0$ von Zuständen geht also mit der Zeit stetig in
immer neue Gebiete $g = g_t$, die „Phasen“ seiner Veränderung, über, welche
sämmtlich die gleiche Ausdehnung $\gamma$ besitzen. Alle diese „späteren“ Phasen $g_t$
($t \ge 0$) bilden zusammen genommen wieder ein stetiges Gebiet $G_0$, die „Zukunft“
von $g_0$, d. h. den Inbegriff aller Zustände, welche künftig irgend einmal
in endlicher Zeit aus solchen von $g_0$ hervorgehen. Dieses Gebiet $G = G_0$
wird ganz im endlichen liegen und eine endliche Ausdehnung $\Gamma \ge \gamma$ besitzen,
wenn wir voraussetzen, dass die Grössen $x_1, x_2, \ldots x_n$ für alle
[488] Anfangszustände von $g_0$ gewisse endliche Grenzen niemals überschreiten. Während
sich nun das Gebiet g von $g_0$ ausgehend mit der Zeit verändert, zugleich mit
allen seinen „späteren Phasen“, deren jede immer in die folgende übergeht,
so ändert sich auch ihre Gesammtheit G wie jedes andere Gebiet und stellt
dabei in jedem Augenblicke t die „Zukunft“ der entsprechenden Phase $g_t$ dar.
Nach der Definition der Zukunft erfolgt diese Veränderung in der Weise, dass
immer nur frühere Zustände austreten, niemals neue eintreten können: jede
Phase von G enthält alle späteren in sich, und die Ausdehnung $\Gamma$ kann immer
nur abnehmen. Da aber nach (4) diese Ausdehnung constant bleiben muss,

1 Jacobi, Dynamik, p. 93; Kirchhoff, Theorie der Wärme, p. 142-144.




On a theorem of dynamics and the mechanical heat theory 219 



tended area $g_0$ of initial states, which may be expressed by relations of the
form:

$$F(\xi_1, \ldots \xi_n) < 0,$$

there corresponds a particular altered area $g = g_t$ at time t, and hence also
to the n-fold integral extended over $g_0$

$$\gamma_0 = \int d\xi_1\, d\xi_2 \cdots d\xi_n,$$

which we shall call the “extension” of $g_0$, there corresponds in general another
extension of g

$$\gamma = \int dx_1\, dx_2 \cdots dx_n.$$

In the special case, however, where the functions $X_\mu$ satisfy condition
(2), the second integral is equal to the first one, by Liouville's theorem, 3
and hence independent of time, irrespective of how the area $g_0$, or g, each
of which is determined by the other, has been chosen, so that we may write,
using abbreviations,

$$d\gamma = dx_1\, dx_2 \cdots dx_n = d\gamma_0 = \text{const.} \tag{4}$$

“The succeeding states corresponding to the initial states of an arbitrary area 
fill areas of the same extension at all instants of time.” 
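The conservation of extension can be checked numerically in the smallest possible case. The following sketch is an illustration under assumptions that are not Zermelo's: a single pendulum with state $(x, v)$ and equations $dx/dt = v$, $dv/dt = -\sin x$, integrated by a Runge-Kutta scheme, with all function names hypothetical. This system is of the form (1), and neither right-hand side contains its own variable, so condition (2) holds. The extension of a small region around an initial state is scaled by the Jacobian determinant of the map from initial to later states, which should therefore remain equal to 1.

```python
import math


def field(state):
    """Right-hand side of the pendulum equations: dx/dt = v, dv/dt = -sin(x)."""
    x, v = state
    return (v, -math.sin(x))


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return (s[0] + h * k[0], s[1] + h * k[1])

    k1 = field(state)
    k2 = field(shifted(state, k1, dt / 2))
    k3 = field(shifted(state, k2, dt / 2))
    k4 = field(shifted(state, k3, dt))
    return (
        state[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
        state[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6,
    )


def flow(state, t, dt=1e-3):
    """Approximate state at time t that evolves from the given initial state."""
    for _ in range(round(t / dt)):
        state = rk4_step(state, dt)
    return state


def jacobian_determinant(state, t, eps=1e-5):
    """Central-difference estimate of the determinant of d(x, v at time t) / d(x, v at time 0)."""
    x0, v0 = state
    fxp, fxm = flow((x0 + eps, v0), t), flow((x0 - eps, v0), t)
    fvp, fvm = flow((x0, v0 + eps), t), flow((x0, v0 - eps), t)
    a = (fxp[0] - fxm[0]) / (2 * eps)
    b = (fvp[0] - fvm[0]) / (2 * eps)
    c = (fxp[1] - fxm[1]) / (2 * eps)
    d = (fvp[1] - fvm[1]) / (2 * eps)
    return a * d - b * c


if __name__ == "__main__":
    # The printed value should be very close to 1, up to discretization error.
    print(jacobian_determinant((1.0, 0.5), t=5.0))
```

The integrator and the finite differences are both approximations, so the result is equal to 1 only up to a small numerical error; the sketch checks the statement for one state and one time span and does not replace the proof.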

An arbitrary area $g_0$ of states is therefore continuously transformed into
ever new areas $g = g_t$, the “phases” of its change, as time passes, all of which
have the same extension $\gamma$. Taken together, all these “later” phases $g_t$ ($t > 0$)
in turn form a continuous area $G_0$, the “future” of $g_0$, i.e., the epitome of all
states arising from [states] of $g_0$ at some arbitrary point in the future over a
finite period of time. This area $G = G_0$ will be completely finite and will have
a finite extension $\Gamma \geq \gamma$, if we assume that, for all initial states of $g_0$, the
quantities $x_1, x_2, \ldots, x_n$ never exceed certain finite boundaries. As the area $g$
starting with $g_0$ changes with time, along with all its “later phases”, each of
which is transformed into its respective successor, so does their totality $G$,
just like any other area, and, at every instant $t$, it represents the “future”
of the corresponding phase $g_t$. According to the definition of future, this
change is due always to the disappearance of earlier states but never to the
appearance of new states: each phase of $G$ contains in itself all later ones,
and the extension $\Gamma$ may decrease only. But since, by (4), this extension



3 Jacobi 1866, p. 98; Kirchhoff 1894, p. 142-144.




220 Zermelo 1896a 



so konnen die austretenden Zustande niemals Gebiete von endlicher Ausdeh- 
nung erfiillen, ihre Anzahl verschwindet gegen die der bleibenden, sodass wir 
sie als singulare bezeichnen konnen. Nun ist go in Go enthalten, also zum 
iiberwiegenden Theil auch in jeder folgenden Phase G T , der Zukunft von g T , 
fiir ein beliebig grosses Zeitintervall r. Das bedeutet aber: es gibt immer Zu- 
stande innerhalb g T , die spater einmal in Zustande von go iibergehen, und, 
ihnen riickwarts entsprechend, Zustande von go, die auch nach Ablaut der 
Zeit r irgend einmal wieder nach go zuriickkehren. Diese letzteren hnden sich 
in alien noch so kleinen Theilen des Gebietes, von denen ja dasselbe wie von 
go selbst gilt, und hangen stetig zusammen, da mit jedem einzelnen zuriick- 
kehrenden Zustancl auch seine nachste Umgebung zuriickkehren muss, d. h. 
sie erfiillen das game Gebiet go mit Ausnahme singularer Zustande von der 
Gesammtausdehnung 0. Schliesst man daher alle diese singularen Zustande 
aus, die zu irgend welchen endlichen Zeiten r gehoren, so verbleibt ein Rest- 
gebiet g' , das nun nicht mehr nothwendig stetig zu sein braucht, aber immer 
noch die iiberwiegende Mehrzahl der Zustande von go umfasst. Diese Zustan- 
de von g' werden nun nach beliebiger Zeit immer noch einmal, also unendlich 
oft nach go zuriickkehren und damit ihren Anfangszustanden beliebig nahe 
kommen, wenn man go geniigend klein angenommen hat. 

Damit ist der Satz von Poincare in voller Ausdehnung bewiesen; fiir 
den vorliegenden Zweck geniigt aber schon der Nachweis, dass die Zustande 
von go im Allgemeinen wenigstens noch einmal nach go zuriickkehren. Schon 
489 hieraus folgt | unmittelbar, dass es keine eindeutige und stetige Function 
S = S (ari, X 2 , ■ ■ ■ x n ) des Zustandes geben kann, die fiir alle Anfangszustdn- 
de eines noch so kleinen Gebietes bestandig zunahme. Denn ware S fiir einen 
Anfangszustand Po wahrend der Zeit r von einem Werthe < R gewachsen auf 
einen anderen > R, so miisste das gleiche gelten von alien Zustanden einer 
gewissen Umgebung g von Po, und fiir die nach g zuriickkehrenden Zustande 
dieses Gebietes miisste die Function nachher wieder abnehmen. 

Dasselbe lasst sich aber auch sehr einfach direct beweisen. Wiirde die 
Function S fiir alle Anfangszustande von g bestandig zunehmen, so wiirde 
sie es auch fiir alle Zustande des grosseren Gebietes G, der Zukunft von g, 
und wegen (4) miisste dann auch das iiber G erstreckte nfache Integral 

$$\int S\, dx_1\, dx_2 \cdots dx_n$$

bestandig zunehmen. Das ist aber unmoglich, weil sich das Integrationsge- 
biet G immer nur um singulare Zustande ohne endliche Ausdehnung veran- 
dert, wobei der Werth des Integrales constant bleibt. 

Sehr anschaulich wird Bedeutung und Beweis des entwickelten Satzes fiir 
den Fall n = 3, wenn man die Variablen ari, X 2 , X 3 als die Coordinaten 
eines materiellen Punktes im Raume auffasst. Dann bestimmen die Glei- 
chungen (1) in Verbindung mit (2) oder mit (4) eine stationare Stromung 
einer incompressiblen Fliissigkeit und zwar in einem geschlossenen Gefas- 
se, wenn die Grossen x M nicht ins Unendliche wachsen sollen. Einem be- 







must remain constant, the disappearing states can never fill areas of finite
extension, and their number is vanishingly small compared with that of the
remaining states. Thus, we may refer to them as singular states. Now, $g_0$ is
contained in $G_0$, and hence, for the most part, also in every succeeding phase
$G_\tau$, the future of $g_\tau$, for an arbitrarily large interval of time $\tau$. But this
means that there are always states within $g_\tau$ that, at some later time, are
transformed into states of $g_0$, and, in reverse correspondence to them, states
of $g_0$ that return to $g_0$ at some point after the expiration of time $\tau$. These
latter ones can be found in even the smallest parts of the area, which, after
all, are subject to the same conditions as $g_0$ itself, and continuously hang
together since for each returning state its immediate neighborhood must return
as well. In other words, they fill the entire area with the exception of
singular states of a total extension equal to 0. Hence, if we exclude all these
singular states belonging to arbitrary finite times $\tau$, then what we obtain is
a remainder area $g'$ that no longer needs to be continuous but still encompasses
the great majority of states of $g_0$. These states of $g'$ will always return to
$g_0$ one more time after some arbitrary period of time, and hence infinitely
many times, thereby getting arbitrarily close to their initial states, if $g_0$ is
taken sufficiently small.
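
The recurrence argument can be illustrated with a discrete toy model. This is a sketch under assumptions that are not in the original: Arnold's cat map on a finite lattice stands in for the flow, the number of lattice cells plays the role of the “extension”, and the names `cat_map`, `first_return_times`, and the chosen region are hypothetical. Because the map is a bijection of a bounded set of states, no state can leave the region for good.

```python
def cat_map(x, y, n):
    """Arnold's cat map on an n-by-n lattice; its matrix [[2, 1], [1, 1]] has determinant 1."""
    return (2 * x + y) % n, (x + y) % n

def first_return_times(region, n, max_steps):
    """For each state in `region`, the first step count t >= 1 at which it is back in `region`.

    A value of None means the state did not return within `max_steps` steps.
    """
    region = set(region)
    times = {}
    for start in region:
        state = start
        times[start] = None
        for t in range(1, max_steps + 1):
            state = cat_map(state[0], state[1], n)
            if state in region:
                times[start] = t
                break
    return times

# A small block of initial states, playing the role of the area g_0 (illustrative choice).
n = 101
g0 = [(x, y) for x in range(10, 15) for y in range(20, 25)]
times = first_return_times(g0, n, max_steps=n * n)

if __name__ == "__main__":
    returned = [t for t in times.values() if t is not None]
    print(f"{len(returned)} of {len(g0)} states returned; latest first return at step {max(returned)}")
```

In this finite model every state returns, since each orbit of a bijection on finitely many states is a cycle. In the continuous setting of the text, the exceptional states need not be absent, but they fill no finite extension.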

This concludes the proof of Poincaré's theorem in its full generality; but
for the present purposes it already suffices to show that the states of $g_0$
generally return to $g_0$ at least one more time. From this alone it immediately
follows that there can be no single-valued and continuous function
$S = S(x_1, x_2, \ldots, x_n)$ of the state that steadily increases for all initial states
of even the smallest area. For if $S$ had increased from a value $< R$ to a value
$> R$ for an initial state $P_0$ over a period of time $\tau$, then the same would have
to hold true for all states of a certain neighborhood $g$ of $P_0$, and, later on, the
function would have to decrease again for the states of this area that return
to $g$.

The same can, however, easily be shown by a direct proof. If the function
$S$ were to increase steadily for all initial states of $g$, then it would do so
for all states of the greater area $G$, the future of $g$, and, on account of (4),
the $n$-fold integral extended over $G$

$$\int S\, dx_1\, dx_2 \cdots dx_n$$

would steadily increase as well. This, however, is impossible since the domain
of integration $G$ always changes only by singular states lacking a finite
extension, so that the value of the integral remains constant.
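
The two incompatible statements about the integral can be written out side by side. The following is a sketch in modern notation, not taken from the original: $x(t;\xi)$ denotes the state reached at time $t$ from the initial state $\xi$, $G_t$ the phase of $G$ at time $t$, and $I(t)$ the integral over $G_t$; all three symbols are introduced here for illustration only.

```latex
\begin{align*}
I(t) &= \int_{G_t} S(x)\, dx_1 \cdots dx_n
      = \int_{G_0} S\bigl(x(t;\xi)\bigr)\, d\xi_1 \cdots d\xi_n
      && \text{(change of variables, unit Jacobian by (4))} \\
I(t) - I(0) &= \int_{G_0} \Bigl[ S\bigl(x(t;\xi)\bigr) - S(\xi) \Bigr]\, d\xi_1 \cdots d\xi_n > 0
      && \text{(if } S \text{ increases along every trajectory)} \\
I(t) &= \int_{G_t} S(x)\, dx_1 \cdots dx_n
      = \int_{G_0} S(x)\, dx_1 \cdots dx_n = I(0)
      && \text{(} G_0 \setminus G_t \text{ has extension } 0 \text{)}
\end{align*}
```

The second line requires $I(t) > I(0)$ and the third gives $I(t) = I(0)$, so no such function $S$ can exist on an area of finite extension.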

The significance and proof of the developed theorem become very clear
for the case $n = 3$ when we conceive of the variables $x_1, x_2, x_3$ as the
coordinates of material points in space. In combination with (2) or (4), equations
(1) then determine a stationary flow of an incompressible fluid, and in
particular one held in a closed container, if the quantities $x_\mu$ are supposed not
to grow infinitely large. In this case, to a particular “state” there corresponds







stimmten „Zustand“ entspricht hier ein Punkt im Raum, einem in der Zeit 
veranderten Zustande ein in Bewegung begriffener materieller Punkt. Die 
von diesen Fliissigkeits-Punkten beschriebenen Bahnen, die „Stromlinien“, 
bilden in stetiger Zusammensetzung „Stromrohren“ oder „Stromfaden“, je 
nachdem sie von geschlossenen Curven oder von Flachenstiicken ausgehen, 
und bleiben bei der stationaren Bewegung immer unverandert. Nun lehrt 
die Anschauung, dass hier alle Stromfaden in sich selbst zurucklaufen miis- 
sen, weil die durchstromende Fliissigkeit weder die Rohren durchbrechen, 
noch sich im Inneren irgendwo ansammeln kann. Daraus folgt aber, dass 
490 jedes endliche Fliissigkeitstheilchen einem einmal ange- | nommenen Orte im- 
mer wieder so nahe kommen muss, als man will, wenn man nur die Fliis- 
sigkeitsfaden diinn genug annimmt und gemigende Zeit zur Verfiigung hat. 
Daneben gibt es freilich auch nicht zuruckkehrende singulare Stromlinien, 
z. B. solche, die sich umstromten eingeschlossenen festen Korpern oder Hohl- 
raumen zwischen den nach den verschiedenen Seiten ausweichenden iibri- 
gen Stromlinien asymptotisch ndhern ; diese vermogen aber niemals Strom- 
faden von endlicher Dicke zu bilden. Sollte dagegen die Stromung ein Ge- 
schwindigkeitspotential besitzen, so miisste dasselbe in dem vollstandig ge- 
schlossenen Gefasse nothwendig mehrdeutig sein, wahrend von der Functi- 
on S in unserer Betrachtung ausdriicklich Eindeutigkeit gefordert wurde. 
Auch in dem allgemeineren Falle n > 3 ist es bei der weitgehenden Ana- 
logic oft von heuristischem Werth, die gleiche Ausdrucksweise beizubehalten 
und die Gleichungen (1) und (2) oder (4) als die einer „stationaren Stro- 
mung einer incompressiblen Fliissigkeit in einem Raume von n Dimensionen 11 
zu deuten. 

Das Ergebniss unserer Betrachtung ist also das folgende: 

In einem System beliebig vieler materieller Punkte, deren Beschleunigun- 
gen nur von ihrer Lage im Raum abhangen, gibt es keine „irreversiblen“ Vor- 
gange fur alle Anfangszustande, die ein noch so Heines Gebiet von endlicher 
Ausdehnung erfiillen, falls sowohl die Coordinaten als die Geschwindigkeiten 
der Punkte endliche Grenzen niemals uberschreiten. 

Der Satz gilt aber auch allgemeiner, insbesondere fiir ein beliebiges mecha- 
nisches System mit den verallgemeinerten Coordinaten q M und ihren Bewe- 
gungsmomenten p dessen Bewegungsgleichungen sich in der Hamilton’schen 
Form schreiben lassen: 



$$\frac{dp_\mu}{dt} = -\frac{\partial H}{\partial q_\mu}, \qquad \frac{dq_\mu}{dt} = \frac{\partial H}{\partial p_\mu},$$

und das wir als ein „conservatives“ bezeichnen konnen, weil hier alle Krafte 
ein Potential besitzen und daher die mechanische Energie erhalten bleibt. In 
einem solchen System namlich ist offenbar immer 

$$\frac{\partial}{\partial p_\mu}\frac{dp_\mu}{dt} + \frac{\partial}{\partial q_\mu}\frac{dq_\mu}{dt} = 0$$







a point in space, and to a state that changes in time a material point in 
motion. The paths taken by these fluid-points, the “streamlines”, form either 
“stream tubes” or “stream filaments” as they are continuously compounded, 
depending on whether they issue forth from closed curves or from plane ar-
eas, and they always remain unaltered in stationary motion. Intuition tells us
that, in this case, all stream filaments must run back into themselves since the
passing fluid can neither break through the tubes nor accumulate somewhere 
in the interior. From this, however, it follows that, time and again, every 
finite particle of the fluid has to get arbitrarily close to any position once oc- 
cupied by it, provided only that the fluid filaments are taken sufficiently thin 
and that sufficient time is available. There are of course also non-recurring 
singular streamlines, such as those asymptotically approximating solid bodies 
embedded in the flow or hollow spaces between the other streamlines, which 
are diverted in different directions; but those can never form stream filaments 
of finite thickness. In contrast, if the flow possesses a velocity potential, then 
the latter would necessarily have to be many-valued in the completely closed 
container, while, in our considerations, we expressly demanded that the func- 
tion S be single-valued. In the more general and largely analogous case n > 3, 
too, it is of heuristic value to retain the same terminology and to interpret 
equations (1) and (2) or (4) as those of a “stationary flow of an incompressible 
fluid in a space of n dimensions”. 

Our considerations lead to the following result: 

In a system of arbitrarily many material points whose acceleration only 
depends on their location in space there are no “irreversible” processes for all 
initial states that fill an arbitrarily small area of finite extension, if both the 
coordinates and the velocities of the points never exceed finite boundaries. 

The theorem is, however, also valid in a more general form, and in particular
for an arbitrary mechanical system with the generalized coordinates $q_\mu$
and their momenta of motion $p_\mu$ whose equations of motion can be written
in Hamiltonian form:

$$\frac{dp_\mu}{dt} = -\frac{\partial H}{\partial q_\mu}, \qquad \frac{dq_\mu}{dt} = \frac{\partial H}{\partial p_\mu},$$

and which we may call a “conservative” system since, in this case, all forces
possess a potential, and hence the mechanical energy is preserved. For in such
a system we evidently always have

$$\frac{\partial}{\partial p_\mu}\frac{dp_\mu}{dt} + \frac{\partial}{\partial q_\mu}\frac{dq_\mu}{dt} = 0$$







491 | und mit dem Analogon der Beziehung (2) iniissen auch alle aus ihr fliessen- 
den Folgerungen ihre Giiltigkeit behalten. 

Nach der mechanischen Theorie in ihrer gewohnlichen atomistischen Dar- 
stellung ware nun die ganze Natur als ein System der betrachteten Art auf- 
zufassen: alle Naturvorgange sind nichts als Bewegungen der Atome oder 
Moleciile, die entweder selbst als ausdehnungslose Punkte oder als Aggregate 
solcher Punkte behandelt werden konnen und ausschliesslich „Centralkraf- 
ten“, die ein Potential haben, und von den Geschwindigkeiten unabhangig 
sind, unterliegen. Eben diese Annahme sucht man in der „kinetischen Gas- 
theorie“ durchzufiihren, indem man die Molecule eines „vollkommenen Gases 11 
als abstossende Centren, als elastische Kugeln oder mit Boltzmann als elasti- 
sche feste Korper anderer Gestalt, jedenfalls aber als conservative 11 Systeme 
in dem angegebenen Sinne betrachtet, nur class man sich hier bei der Wirkung 
zweier Molecule aufeinander auf „Stosskrafte“ beschrankt, d. h. auf Abstos- 
sungen, die erst bei sehr grosser gegenseitiger Annaherung wirksam werden. 

Unter diesen Voraussetzungen konnten also auf Grund der vorhergehen- 
den Betrachtungen „ irreversible 11 Vorgange fiir allgemeinere Anfangszustande 
nur dadurch moglich werden, dass, von einer gleichformig fortschreitenden 
Bewegung des Gesammtschwerpunktes natiirlicli abgesehen, Molecule sich 
ins Unendliche zerstreuen oder schliesslich unendlich grosse Geschwindigkei- 
ten gewinnen. 1st aber das erstere durch die besondere Natur des Systems, 
das wir uns z. B. von einer festen Hiille umgeben denken konnen, ausgeschlos- 
sen, so ist es auch das letztere auf Grund des Princips von der Energie. Denn 
sonst miisste zur Erreichung einer unendlich grossen lebendigen Kraft erst 
eine unendlich grosse Arbeit geleistet werden, was nur bei unbegrenzter An- 
naherung zweier anziehenden Centren eintreten konnte, wahrend wir doch 
nach unserer Erfahrung bei sehr grosser Annaherung schlechterdings keine 
anderen als abstossende Krafte voraussetzen diirfen. Haben wir z.B. ein in 
ein festes Gefass mit elastischen und fiir Warme undurchdringlichen Wan- 
den eingeschlossenes Gas, so gabe es zwar im allgemeinen eine unendliche 
Mannigfaltigkeit von Anfangszustanden der Molecule, fiir welche das Gas 

492 bleibenden | Zustandsanderungen, wie Reibung, Warmeleitung oder Diffusi- 
on entgegenginge. Aber daneben gabe es noch sehr viel mehr von vornherein 
ebenso mogliche Anfangszustande, wie man sie schon durch beliebig kleine 
Verriickungen eines Moleciils aus den friiheren erhalten konnte, fiir welche 
anstatt solcher irreversiblen Processe alle Zustande sich mit beliebig kleinen 
Abanderungen in dem oben angegebenen Sinne periodisch wiederholten. Das 
miisste auch gelten, wenn etwa der auf unsere Sinne wirkende physikalische 
Zustand, z. B. die Temperatur, und mit ihm auch der Werth der Entropie, 
nicht durch den augenblicklichen Bewegungszustand definirt ware, sondern 
erst durch eine endliche Folge von Bewegungen , die aber jedenfalls durch den 
anfanglichen Bewegungszustand bestimmt. ware und mit ihm iminer wieder- 
kehren miisste. 

Um daher die allgemeine Giiltigkeit des zweiten Hauptsatzes festzuhal- 
ten, ware man zu der Annahme genothigt, dass trotz ihrer geringeren Anzahl 







and, given the analogue of the relation (2), all consequences following from 
it must retain their validity as well. 
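
Why a Hamiltonian system satisfies the analogue of relation (2) can be verified in one line. This is a sketch that assumes the standard form of Hamilton's equations, $dp_\mu/dt = -\partial H/\partial q_\mu$ and $dq_\mu/dt = \partial H/\partial p_\mu$, and a twice continuously differentiable $H$, so that mixed partial derivatives commute; the smoothness assumption is not stated in the original.

```latex
\begin{align*}
\frac{\partial}{\partial p_\mu}\frac{dp_\mu}{dt}
  + \frac{\partial}{\partial q_\mu}\frac{dq_\mu}{dt}
&= \frac{\partial}{\partial p_\mu}\Bigl(-\frac{\partial H}{\partial q_\mu}\Bigr)
  + \frac{\partial}{\partial q_\mu}\Bigl(\frac{\partial H}{\partial p_\mu}\Bigr) \\
&= -\frac{\partial^2 H}{\partial p_\mu\,\partial q_\mu}
  + \frac{\partial^2 H}{\partial q_\mu\,\partial p_\mu}
 = 0 .
\end{align*}
```

Since each pair $(q_\mu, p_\mu)$ contributes zero, the sum over all $\mu$ vanishes as well, which is the divergence-free condition that the recurrence argument relies on.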

The common, atomistic account of the mechanical theory suggests that
we view nature in its entirety as a system of the sort under consideration: all
processes in nature are but motions of atoms or molecules, which themselves
can be treated either as points without extension or as aggregates of such
points and which are subject only to “central forces” that have a potential
and are independent of the velocities. It is this assumption that one seeks
to carry through in the “kinetic theory of gases” by considering the molecules
of a “perfect gas” as repulsive centers, elastic spheres, or, with Boltzmann,
elastic solid bodies of a different form, but always as “conservative” systems
in the sense under consideration, except only that the action of any two
molecules upon one another is limited to “impact forces”, i.e., to repulsions
which take effect only when the molecules come very close to one another.

According to the previous considerations, “irreversible” processes for more
general initial states could possibly occur under these conditions only if
molecules disperse to infinity or eventually move at infinitely large
velocities, except, of course, for the uniform motion of the total center of mass.
But if the first possibility is excluded by the special nature of the system,
which can be thought of, e.g., as being enclosed in a solid casing, then so is
the latter possibility by virtue of the principle [of the conservation] of
energy. For, otherwise, reaching an infinitely great vis viva [kinetic energy]
would require the performance of an infinitely large amount of work, which could
only occur if two attracting centers approached each other without limit, while
our experience teaches us to assume none but repulsive forces at very close
approach. Consider, e.g., a gas enclosed in a solid container
whose walls are elastic and impenetrable to heat. In this case, there would in
general be an infinite manifold of initial states of the molecules for which the
gas would undergo permanent changes of state such as friction, heat conduction,
or diffusion. But there would also be many more initial states equally possible
from the outset, such as those obtainable from earlier ones through arbitrarily
small displacements of a molecule, for which, instead of such irreversible
processes, all states would periodically repeat themselves in the sense
specified above with arbitrarily small variations. This would also have to be true
if, e.g., the physical state affecting our senses, such as the temperature and,
along with it, the value of the entropy, were not defined by the current state
of motion but only by a finite sequence of motions, which, however, would at
least be determined by the initial state of motion and would always have to
recur together with it.

In order to retain the general validity of the second law, we therefore would 
have to assume that just those initial states leading to irreversible processes 







gerade jene zu irreversiblen Vorgangen fuhrenden Anfangszustande in der Na- 
tur einmal verwirklicht seien, wahrend die anderen, mathematisch betrachtet, 
wahrscheinlicheren thatsachlich nicht vorkamen. 

So unwiderleglich eine solche Annahme auch ware, so wenig entsprache 
sie unserem Causalitatsbediirfniss und jedenfalls dem Geiste der mechani- 
schen Naturbetrachtung selbst, der uns immer nothigen wird, alle denkbaren 
mechanischen Anfangszustande, wenigstens innerhalb gewisser Grenzen auch 
als physikalisch moglich vorauszusetzen, zumal solche, die eine iiberwiegen- 
de Mehrheit ausmachen und von wirklich vorkommenden nur beliebig wenig 
abweichen. Beziehen sich doch, streng genommen, alle unsere Naturgesetze 
nicht auf bestimmte Grossen oder Vorgange, die sich genau ja niemals be- 
obachten lassen, sondern immer nur auf gewisse Spielraume, Annaherungen 
und Wahrscheinlichkeiten, wahrend Singularitaten ausschliesslich als Grenz- 
falle in der Abstraction existiren. Die hier erorterte Annahme stande also 
einzig da in der Physik, und ich glaube daher nicht, dass sie irgend jemand 
wiirde dauernd befriedigen konnen. 

Dass nicht alle denkbaren Anfangszustande dem zweiten Hauptsatz ent- 
493 sprechen konnen, geht schon daraus hervor, | dass bei einer Umkehrung der 
Geschwindigkeitsrichtungen aller Molecule zu einem beliebigen Zeitpunkt sich 
auch der ganze zeitliche Verlauf eines Vorganges umkehren miisste. In der 
That ist auch dieses Bedenken schon langst gegen die mechanische Ablei- 
tung irreversibler Processe geltend gemacht worden und hat noch im Winter 
1894/95, angeregt durch eine Aeusserung CulverwelV s, zu einer ausgedehnten 
Discussion dieser Fragen in der „Nature u Veranlassung gegeben, ohne indess, 
wie mir scheint, zu einer befriedigenden Losung gefiihrt zu haben. Es liess 
sich eben nicht beweisen, dass der plrysikalische Zustand eines Gases, auf den 
es allein ankommt, fur gleiche und entgegengesetzte Geschwindigkeiten aller 
Molecule immer derselbe sein miisse, in welchem Falle allein hier von einer 
wirklichen Umkehrung des Vorganges gesprochen werden diirfte, und es blieb 
ferner noch die Moglichkeit offen, dass wenigstens fiir ein ausgedehntes Ge- 
biet von Anfangszustanden bestandige Vermehrung der Entropie stattfinden 
konne. Beides sind Einwande gegen die angegebene Argumentation, die erst 
durch die Anwendung des Poincare’ schen Satzes beseitigt werden. 

Nach alledem bestande also die Nothwendigkeit, entweder dem Carnot- 
Clausius’ schen Princip oder aber der mechanischen Grundansicht eine princi- 
piell andere Fassung zu geben, sofern man sich immer noch nicht entschliessen 
kann, die letztere iiberhaupt endlich aufzugeben. Geringere Abanderungen 
wiirden hier, wie mir scheint, kaum zum Ziele fiihren. Wollte man beispiels- 
weise versuchen, die zwischen den Moleciilen oder Atomen wirkenden Krafte 
statt allein von ihrer gegenseitigen Lage auch von ihren Geschwindigkeiten 
abhangig zu machen, womit allerdings die Anwendbarkeit unseres Satzes ver- 
mieden wiirde, so miisste man, um nicht gegen das Princip der Energie zu 
verstossen, Zusatzkrafte einfiihren, deren Arbeit bestandig verschwindet, de- 
ren Richtung also durch die Geschwindigkeiten mit bestimmt wird. Dann aber 
konnten die Krafte nicht mehr unablrangig voneinander nach Wirkung und 







are realized in nature, their small number notwithstanding, while the other 
ones, whose probability of existence is higher, mathematically speaking, do 
not actually occur. 

While such an assumption would certainly be irrefutable, it would hardly 
accord with our need for causality or even with the spirit of the mechanical 
approach to nature itself, which will always compel us, at least within certain 
limits, to consider all those mechanical initial states as physically possible that 
are conceivable, and in particular those that constitute a great majority and 
deviate from those actually occurring in reality by an arbitrarily small amount 
only. For, strictly speaking, all of our laws of nature refer not to specific 
magnitudes or processes, which, after all, defy precise observation, but always 
only to certain margins, approximations and probabilities, while singularities 
only exist as limiting cases in abstraction. The assumption discussed here 
would thus be unique in physics, and I therefore doubt that anyone could 
find lasting satisfaction in it. 

That not every conceivable initial state is capable of satisfying the second
law already follows from the fact that the reversal of the directions of the
velocities of all molecules at an arbitrary instant of time would necessarily
lead to a reversal of the entire time evolution of a process. In fact, this
concern was raised against the mechanical derivation of irreversible
processes long ago, and, as recently as the winter of 1894/95, it
led to an extensive discussion of these questions in Nature, which was
instigated by a comment by Culverwell but which, as far as I can see, failed to
reach a satisfactory resolution. It was simply not possible to prove that the
physical state of a gas, which is all that counts, must always be the same for
identical and opposing velocities of all molecules, which is the only case in
which it would be legitimate to speak of an actual reversal of the process.
Furthermore, the possibility was not excluded that at least for some extended
area of initial states there could be a constant increase of entropy. Both are
objections to the line of argument under discussion and can only be removed
by applying Poincaré's theorem.
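
The velocity-reversal argument mentioned above can be tried out on a small system whose forces depend on positions only. This is a sketch under assumptions that do not come from the original: three unit masses on a line, the potential given in the docstring, the velocity-Verlet integrator, and all names and initial values are illustrative. The system is run forward, all velocities are reversed, and it is run forward again for the same time.

```python
def forces(x):
    """Forces from V = sum(x_i**2 / 2) + sum((x_i - x_{i+1})**4 / 4); positions only (assumed)."""
    f = [-xi for xi in x]            # restoring force from the x_i**2 / 2 terms
    for i in range(len(x) - 1):
        d = x[i] - x[i + 1]
        f[i] -= d ** 3               # -dV/dx_i for the coupling term
        f[i + 1] += d ** 3           # equal and opposite reaction on the neighbor
    return f

def verlet(x, v, dt, n_steps):
    """Velocity-Verlet integration with unit masses; returns final positions and velocities."""
    x, v = list(x), list(v)
    a = forces(x)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]   # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]         # drift
        a = forces(x)
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]   # half kick
    return x, v

# Hypothetical initial state.
x0 = [0.3, -0.2, 0.5]
v0 = [0.0, 0.4, -0.1]

x1, v1 = verlet(x0, v0, dt=0.01, n_steps=1000)                   # forward evolution
x2, v2 = verlet(x1, [-vi for vi in v1], dt=0.01, n_steps=1000)   # reversed velocities

max_position_error = max(abs(a - b) for a, b in zip(x2, x0))
max_velocity_error = max(abs(a + b) for a, b in zip(v2, v0))     # v2 should equal -v0

if __name__ == "__main__":
    print(f"position error after reversal: {max_position_error:.2e}")
    print(f"velocity error after reversal: {max_velocity_error:.2e}")
```

The integrator was chosen because it is itself time-reversible, so the return to the initial positions, with opposite velocities, is limited only by rounding error. Whether the physical state of a gas is the same for opposite velocities is the separate question raised in the text, and the sketch does not address it.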

All of this seems to suggest that it is imperative to provide an altogether
different version of either the Carnot-Clausius principle or of the basic
mechanical approach, unless one decides to entirely abandon the latter at last.
I do not believe that minor modifications would be effectual. For instance,
if one were to try to make the forces acting between molecules or atoms
dependent also on their velocities in addition to their relative locations, which
would admittedly avoid the applicability of our theorem, then, so as not to
violate the principle of [the conservation of] energy, it would be necessary to
introduce additional forces whose work constantly vanishes, and hence whose
direction is also determined by the velocities. But then, the forces could no




228 



Boltzmann 1896 



Gegenwirkung von Punkt zu Punkt wirken, wie doch der ganzen Atomtheorie 
wesentlich ist. 

Aber mag es auch gelingen, durch geeignete Abanderung der Vorausset- 
494 zungen, z. B. unter Zugrundelegung der Hertz’- 1 schen „Principien der Mecha- 
nik" 1 , dem dargelegten Widerspruche zu entgehen, so ist es doch jedenfalls 
unmoglich, auf Grund der bisherigen Theorie ohne Specialisirung der An- 
fangszustande eine mechanische Ableitung des zweiten Hauptsatzes durchzu- 
fiihren, und es ist ebenso unmoglich, unter den gleichen Voraussetzungen das 
bekannte Gesetz der Geschwindigkeitsvertheilung unter den Gasmoleciilen, 
wie seine Entdecker Maxwell und Boltzmann wollten, als den nach einiger 
Zeit sich regelmassig einstellenden stationaren Endzustand zu erweisen. Von 
einer eingehenden Priifung der bisherigen Versuche einer solchen Ableitung 
im einzelnen, namentlich der von Boltzmann und Lorentz (in den Berichten 
der Wiener Akademie) 2 , habe ich bei der Schwierigkeit des Gegenstandes 
vorlaufig Abstand genommen, um lieber mit moglichster Klarheit darzule- 
gen, was mir hier als streng beweisbar und principiell wichtig erscheint, und 
dadurch zu einer erneuten Erorterung und schliesslichen Losung der vorlie- 
genden Frage beizutragen. 

Berlin, im December 1895. 



Entgegnung auf die warmetheoretischen 
Betrachtungen des Hrn. E. Zermelo 

Boltzmann 1896 



Schon Clausius , Maxwell u. A. haben wiederholt darauf hingewiesen, dass 
die Lehrsatze der Gastheorie den Charakter statistischer Wahrheiten haben. 
Ich habe besonders oft und so deutlich als mir moglich war betont 1 , dass 
das Maxwell’sche Gesetz der Geschwindigkeitsvertheilung unter Gasmolecii- 



1 Die v. Helmholtz ’ sche Theorie der „cyclischen Systeme“ in ihrer urspriinglichen 
Form dagegen wiirde von den Folgerungen des Poincare ’ schen Satzes mit betrof- 
fen werden, da sie in letzter Linie gleichfalls, wenn auch in anderer Form, auf die 
Hamilton ’ schen Gleichungen zuriickgeht. 

2 Neuerdings zusammengestellt in Boltzmann’s „Vorlesungen fiber Gastheorie" 1. 
1896. 

1 L. Boltzmann, Wien. Sitzungsber. II 75 . p. 67. 1877; 76 . p. 373. 1877; 78 . p. 740. 
1878. „Der zweite Hauptsatz der Warmetheorie", Rede gehalten am 29. Mai 1886, 
Almanach d. Wien. Akad. Nature 51 . p. 413, 28. Febr. 1895; Vorlesung fiber 
Gastheorie p. 42. 1895, Leipzig bei J. A. Barth. 




Rejoinder to the heat-theoretic considerations of Mr. E. Zermelo 



229 



longer act independently of one another in action and reaction from point to 
point, as is essential to the entire atomistic theory. 

But even if it is possible to escape the contradiction considered here by
making suitable changes to the underlying assumptions, e.g., by building on
H. Hertz 1894, 4 it is certainly impossible to carry out a mechanical derivation
of the second law on the basis of the existing theory without specializing the
initial states, and it is equally impossible to show, under the same
assumptions, that the well-known law of velocity distribution among gas molecules
is the stationary final state regularly reached after some time, as its
discoverers, Maxwell and Boltzmann, intended. Faced with the difficulty of the
subject matter, I have for now refrained from a thorough review of the previous
attempts at such a derivation, in particular the one undertaken by Boltzmann
and Lorentz (published in the Berichte der Wiener Akademie). 5 Instead, I
chose to present here as clearly as possible what I consider strictly provable
and essentially important in order to contribute to a renewed discussion of
the question at hand and to its eventual resolution.

Berlin, December 1895. 



Rejoinder to the heat-theoretic 
considerations of Mr. E. Zermelo 

Boltzmann 1896 



[The introductory note just before 1896a also addresses Boltzmann 1896.]

As Clausius and Maxwell, among others, have already repeatedly pointed
out, the principles of the theory of gases have the character of statistical
truths. I have pointed out particularly often, and as clearly as I possibly
could, 1 that Maxwell's law of velocity distribution among gas molecules cer- 

4 By contrast, the original version of v. Helmholtz's theory of “cyclical systems”
would also be affected by the consequences of Poincaré's theorem since it, too,
albeit in a different form, is, in the end, a descendant of the Hamiltonian
equations.

5 Recently assembled in Boltzmann 1896a.

1 Boltzmann 1877a, p. 67; Boltzmann 1877b, p. 373; Boltzmann 1878, p. 740; “Der
zweite Hauptsatz der Wärmetheorie”, lecture delivered on May 29, 1886,
Boltzmann 1886, Boltzmann 1895b, p. 413; Boltzmann 1896a, p. 42.







len keineswegs wie ein Lehrsatz der gewohnlichen Mechanik aus den Bewe- 
gungsgleichungen allein bewiesen werden kann, dass man vielmehr nur bewei- 
sen kann, dass dasselbe weitaus die grosste Wahrscheinlichkeit hat und bei 
einer grossen Anzahl von Moleciilen alle iibrigen Zustande damit verglichen 
so unwahrscheinlich sind, dass sie praktisch nicht in Betracht kommen. An 
derselben Stelle habe ich auch betont, dass der zweite Hauptsatz vom mole- 
culartheoretischen Standpunkte ein blosser Wahrscheinlichkeitssatz ist. Die 
Abhandlung des Hrn. Zermelo „Ueber einen Satz der Dynamik und die me- 
chanische Warmetheorie“ 2 zeigt nun zwar, dass meine betreffenden Arbeiten 
trotzdem nicht verstanden worden sind, demungeachtet muss ich mich iiber 
diese Abhandlung freuen, als iiber den ersten Beweis, dass diesen Arbeiten 
in Deutschland iiber haupt Aufmerksamkeit geschenkt wird. 

Der von Hrn. Zermelo zu Anfang auseinander gesetzte Satz Poincare' s ist 
selbstverstandlich richtig, aber dessen Anwendung auf die Warmetheorie ist 
es nicht. 

Ich habe den Beweis des Maxwell' schen Geschwindigkeitsvertheilungs- 
gesetzes aus dem Satze abgeleitet, dass nach den Wahrscheinlichkeitsgeset- 
zen eine gewisse Grosse H (gewissermaassen das Maass des Grades der Ab- 
weichung des herrschenden Zustandes vom Maxwell’ schen) fiir ein in einem 
774 | ruhenden Gefasse ruhendes Gas nur abnehmen kann. Die Art und Weise 

dieser Abnahme wird am besten klar, wenn man sich, wie ich es 1 that, die 
Zeit als Abscisse und die dazu gehorigen Werthe der Grosse H, vermindert 
um deren kleinsten Werth H min , als Ordinate aufgetragen denkt, wodurch 
man die sogenannte H- Curve erhalt. 

Setzt man, wie es bei dem in meiner Gastheorie § 5 auseinandergesetzten 
Beweise ausdriicklich geschieht, zuerst die Anzahl der Gasmoleciile unendlich 
und lasst erst nachlrer die Zeit der Bewegung sehr gross werden, so erhalt 
man in der weitaus iiberwiegenden Mehrheit der Falle eine Curve, welche 
sich asymptotisch der Abscissenaxe nahert. Dann ist auch, wie man leicht 
sieht, der Poincare’sehe Satz nicht anwendbar. 

Nimmt man aber die Zeit der Bewegung unendlich gross, dagegen die An- 
zahl der Molecule zwar sehr, aber nicht absolut unendlich gross an, so hat die 
H-Curve einen anderen Charakter. Sie verläuft, wie ich schon am citirten Or-
te in der Nature zeigte, fast immer sehr nahe der Abscissenaxe. Nur äusserst
selten erhebt sie sich höher über dieselbe, was wir einen Buckel nennen wollen,
und zwar nimmt die Wahrscheinlichkeit eines Buckels mit wachsender Hohe 
desselben rapid ab. Für jede Zeit, für welche die Ordinate der H-Curve sehr
klein ist, herrscht fast genau die Maxwell'sche Geschwindigkeitsvertheilung;
bedeutende Abweichung von derselben aber finden an den hohen Buckeln
der H-Curve statt. Hr. Zermelo glaubt nun aus dem Poincaré'schen Satze
schliessen zu konnen, dass sich das Gas nur bei gewissen singularen Anfangs- 
zustanden, deren Anzahl unendlich klein ist gegen die aller moglichen An- 



2 Zermelo, Wied. Ann. 57. p. 485. 1896. 
1 L. Boltzmann, Nature 1. c. 




Rejoinder to the heat-theoretic considerations of Mr. E. Zermelo 



231 



tainly cannot be proved from the equations of motion alone like a principle of 
ordinary mechanics and that, in fact, it is only possible to prove that it has 
by far the highest probability and that, given a large number of molecules, 
all other states are so improbable in comparison that they are practically 
irrelevant. At the same time, I also stressed that the second law is but a 
principle of probability theory as far as the molecular-theoretic point of view 
is concerned. Although Zermelo's essay “Ueber einen Satz der Dynamik und
die mechanische Wärmetheorie” 2 shows that my works on these issues are
still not understood, I cannot but take delight in it for this essay consti- 
tutes the first proof that these works of mine receive any attention at all in 
Germany. 

While the theorem by Poincaré that Zermelo discusses in the beginning
of his paper is of course correct, its application to heat theory is not. 

I derived the proof of Maxwell's law of velocity distribution from the
theorem which states that, according to the laws of probability theory, a 
certain magnitude H (the measure of the degree of deviation of the prevailing 
state from the Maxwell state, as it were) can only decrease for a gas at 
rest in a container at rest. The way in which this decrease occurs can best 
be understood by plotting time along the abscissa axis, as I did, 3 and the 
corresponding values of the magnitude H, decreased by their least value H_min,
along the ordinate axis, which yields the so-called H-curve.

If we first take the number of gas molecules to be infinite, as was clearly 
done in the proof discussed in §5 of my 1896a, and only then let the time of
the motion grow very large, then, in the vast majority of cases, we obtain a 
curve asymptotically approximating the abscissa axis. Moreover, as can easily 
be seen, Poincaré's theorem is not applicable in this case.

If, however, we take the time of the motion to be infinitely great and, 
in contrast, the number of molecules to be very great but not absolutely 
infinite, then the H-curve has a different character. As I have already shown
in the contribution to Nature quoted above, it almost always runs very close 
to the abscissa axis. It is only in the rarest cases that it rises higher above 
it, which we shall call a “hump”. In particular, the probability of a hump 
rapidly decreases as its height increases. For every moment in time for which 
the ordinate of the H-curve is very small, the Maxwell velocity distribution
obtains almost exactly; significant deviations from it occur, however, at high
humps of the H-curve. Now, Mr. Zermelo believes that he can conclude from
Poincaré's theorem that it is only for certain singular initial states whose
number is infinitesimal compared to that of all possible initial states that 



2 Zermelo 1896a. 

3 Boltzmann 1895b, l. c.







fangszustande, clem Maxwell'schen Geschwindigkeitsvertheilungsgesetze im- 
mer mehr nahert, wahrend bei den meisten Anfangszustanden dieses Gesetz 
nicht Platz greift. Dies scheint mir nicht richtig zu sein. Gerade fiir gewisse 
singulare Anfangszustande tritt das Maxwell' sche Geschwindigkeitsverthei- 
lungsgesetz niemals ein, z. B. wenn alle Moleciile anfangs in einer an beiden 
Enden auf der Gefasswand senkrechten Geraden lagen. In der weitaus (un- 
endlich) iiberwiegenden Mehrzahl von Anfangsbedingungen dagegen hat die 
H-Curve den soeben geschilderten Charakter.

775 | Liegt der Anfangszustand des Gases auf einem enorm hohen Buckel, d. h. 
weicht er ganzlich von der Maxwell' schen Geschwindigkeitsvertheilung ab, so 
wird sich der Zustand mit enormer Wahrscheinlichkeit dieser Geschwindig- 
keitsvertheilung nahern unci wahrend enorm langer Zeit nur verschwindencl 
wenig davon abweichen. Allerdings kann man, wenn die Zeit der Bewegung 
noch mehr verlängert wird, wieder zu einem grösseren Buckel der H-Curve
gelangen, ja, wenn diese Verlangerung nur geniigend fortgesetzt wird (also 
selbstverstandlich fiir in mathematischem Sinn unendlich lange Bewegungs- 
dauer unendlich oft), muss sogar der alte Zustand wieder kehren. 

Hr. Zermelo hat daher vollstandig recht, wenn er behauptet, dass die Be- 
wegung im mathematischen Sinne eine periodische ist, aber, weit entfernt 
meine Satze zu widerlegen, ist diese Periodicitat vielmehr in vollster Harmo- 
nie mit denselben. Man vergesse nicht, dass die Maxwell'sche Geschwindig- 
keitsvertheilung kein Zustand ist, wobei jedem Moleciil ein bestimmter Ort 
und eine bestimmte Geschwindigkeit angewiesen wird und welcher etwa da- 
durch erreicht wird, dass sich der Ort und die Geschwindigkeit jedes Moleciils 
diesem bestimmten Orte und dieser bestimmten Geschwindigkeit assympto- 
tisch nahern. Unter einer endlichen Zahl von Moleciilen kann iiberhaupt nie- 
mals exact, sondern nur mit grosser Annaherung die Maxwell' sche Geschwin- 
digkeitsvertheilung bestehen. Diese ist keineswegs eine ausgezeichnete sin- 
gulare Geschwindigkeitsvertheilung, welcher unendlich vielmal mehr Nicht- 
MaxwelV sche Geschwindigkeitsvertheilungen gegeniiber stehen; sondern sie 
ist im Gegentheile dadurch charakterisirt, dass die weitaus grosste Zahl der 
iiberhaupt moglichen Geschwindigkeitsvertheilungen die charakteristischen 
Eigenschaften der Maxwell' schen haben und gegeniiber dieser Zahl die An- 
zahl clerjenigen moglichen Geschwindigkeitsvertheilungen, welche bedeutend 
von der Maxwell' schen abweichen, verschwindend klein ist. Wahrend daher 
Hr. Zermelo sagt, die Anzahl derjenigen Zustande, welche schliesslich zum 
Maxwell' schen fiihren, sei verschwindend gegeniiber der aller moglichen Zu- 
stande, so behaupte ich dagegen, dass iiberhaupt die weitaus grosste Zahl der 
gleich moglichen Zustande „ Maxwell' sche 11 sind und dagegen die Zahl der we- 

776 sentlich von der Maxwell' schen | Geschwindigkeitsvertheilung abweichenden 
nur verschwindend klein ist. 1 



1 Ueber das, was hierbei unter gleich moglichen Zustanden zu verstehen ist, vgl. 

meine eingangs citirten Abhandlungen. 







the gas comes ever closer to satisfying Maxwell's law of velocity distribution,
while this law does not apply to most initial states. This does not seem right 
to me. It is particularly for certain singular initial states that Maxwell’s law 
of velocity distribution never holds, such as when all molecules were at first 
lying in a straight line perpendicular at both ends to the wall of the container. 
By contrast, for the vast (infinite) majority of initial conditions, the H-curve
has the character just described. 

If the initial state of the gas lies on an enormously high hump, i.e., if
it entirely deviates from the Maxwell velocity distribution, then the state 
will approximate this velocity distribution with enormously high probability 
and deviate from it only by a vanishingly small amount over an enormously 
long period of time. It is, however, possible to reach again a greater hump 
of the H-curve by further extending the time of motion. In fact, it is even
the case that the original state must return, provided only that we continue 
to sufficiently extend the time of motion (that is, of course, infinitely many 
times for a duration of motion that is infinitely long in the mathematical 
sense of the word). 

Mr. Zermelo is therefore right in claiming that, mathematically speaking, 
the motion is periodic. He has by no means succeeded, however, in refuting 
my theorems, which, in fact, are entirely consistent with this periodicity. One 
should not forget that Maxwell’s velocity distribution is not a state where 
a specific location and a specific velocity are assigned to each molecule and 
which is reached when, say, the location and velocity of each molecule asymp- 
totically approximate this specific location and velocity. Generally, among a 
finite number of molecules, the Maxwell velocity distribution can never obtain 
exactly but only with great approximation. It is by no means a distinguished, 
singular velocity distribution, which is pitted against infinitely many more 
non-Maxwell velocity distributions. Quite the contrary. It is characterized by
the fact that by far the greatest number of all possible velocity distributions
have the characteristic properties of the Maxwell distribution and that, in 
comparison, the number of possible velocity distributions significantly devi- 
ating from the Maxwell distribution is vanishingly small. Thus, contrary to 
Mr. Zermelo's assertion that the number of states eventually leading to the
Maxwell distribution vanishes compared to that of all possible states, I hold
that by far the greatest number of equally possible states are “Maxwell” and
that, by contrast, the number of states essentially deviating from the Maxwell
velocity distribution is only vanishingly small. 4



4 As for what I mean by equally possible states, see the works referred to at the 
beginning. 







Fur das erste Moleciil ist jeder Ort im Raume und fur Geschwindigkeits- 
componente dessen erste jede mit dem Energieprincipe vertragliche Grosse 
gleich wahrscheinlich. 

Combinirt man aber alle Zustande aller Molecule, so erhalt man in den 
weitaus meisten Fallen mit grosser Annaherung das Maxwell’ sche Geschwin- 
digkeitsvertheilungsgesetz. Nur ganz wenige Combinationen geben eine total 
davon abweichende Zustandsvertheilung. 

Ein Analogon hierfiir bietet die Theorie der Methode der kleinsten Qua- 
drate, wo fiir jeden Elementarfehler ein positiver oder ein gleicher negativer 
Werth als gleich wahrscheinlich angenommen und dann bewiesen wird, class 
wenn man alle moglichen Werthe der Elementarfehler in alien moglichen Wei- 
sen combinirt, bei der grossten Mehrzahl der Combinationen das Gawss’sche 
Fehlergesetz herauskommt und nur bei verhaltnissmassig verschwindend we- 
nigen Combinationen bedeutende Abweichungen davon eintreten, welche also 
nicht unmoglich, aber unendlich unwahrscheinlich sind. 

Ein noch einfacheres Beispiel bietet das Würfelspiel. Bei 6000 Würfen
mit demselben Wiirfel wird man beilaufig 1000 Einser-, 1000 Zweierwiirfe 
etc. machen; aber keineswegs deshalb, weil die gerade zufallig eingetretene 
Reihenfolge der Wiirfe wahrscheinlicher ware, als eine Reihe von 6000 Ein- 
serwiirfen, sondern bloss, weil weit mehr mogliche Combinationen auf eine 
nahe gleiche Zahl von Einserwiirfen, Zweierwiirfen etc., als auf lauter Einser- 
wiirfe fiihren. 

Die Wahrscheinlichkeitsrechnung fiihrt daher, wie langst bekannt, eben- 
falls zu dem Resultate, class eine Wiederkehr des urspriinglichen Zustandes 
durchaus nicht mathematisch ausgeschlossen ist, ja dass dieselbe sogar zu er- 
warten ist, wenn die Zeit der Bewegung geniigend lange ausgedehnt wird, cla 
die Wahrscheinlichkeit eines dem Anfangszustande sehr nahe liegenden Zu- 
standes sehr klein, aber nicht unendlich klein ist. Die Consequenz cles Poin- 
777 care’schen Satzes, dass | abgesehen von wenigen singularen Zustandsverthei- 
lungen ein dem Anfangszustande sehr naher Zustand nach einer, wenn auch 
sehr langen Zeit immer wiederkehren muss, steht daher in vollstem Einklange 
mit meinen Lehrsatzen. 

Nun cler Schluss, dass an clen mechanischen Grundanschauungen irgend 
etwas geandert oder diese gar aufgegeben werden miissten, darf daraus nicht 
gezogen werden. Dieser Schluss ware nur berechtigt, wenn sich aus den mecha- 
nischen Grundanschauungen irgend eine mit cler Erfahrung in Widerspruch 
stehende Consequenz ergabe. Dies ware aber nur der Fall, wenn Hr. Zerme- 
lo beweisen konnte, dass die Zeitdauer dieser Periode, innerhalb welcher der 
alte Zustand des Gases nach clem Poincare ’ schen Satze eintreten muss, eine 
beobachtbare Lange hat. Es diirfte nun zwar schon a priori evident sein, dass, 
wenn etwa eine Trillion winziger Kugeln, jede mit einer grossen Geschwin- 
digkeit begabt, zu Anfang der Zeit in einer Ecke eines Gefasses mit absolut 
elastischen Wanden beisammen waren, sich dieselben in kurzer Zeit ziemlich 
gleichmassig im Gefasse vertheilen werden, und dass die Zeit, wo sich alle 
ihre Stosse so compensirt haben, dass sie alle wieder in derselben Ecke zu- 







Every location in space is equally probable for the first molecule, and so 
is every magnitude compatible with the principle of energy conservation for 
its first velocity component. 

However, by combining all states of all molecules we obtain Maxwell's 
law of velocity distribution in by far the greatest number of cases with great 
approximation. Only very few combinations yield a distribution of states 
entirely deviating from it. 

The theory of the method of least squares provides the following ana- 
logue, where for every basic error some positive or identical negative value 
is taken to be equally probable, and it is then proved that by combining 
all possible values of the basic errors in every way possible we obtain the 
Gaussian law of error for the greatest majority of combinations and that we 
find significant deviations from it only for a few combinations, whose number 
is vanishingly small by comparison and which are hence not impossible but 
infinitely improbable. 

An even simpler example is provided by the game of dice. If you roll 
the same die 6000 times, then you will get approximately 1000 ones, 1000 twos,
etc.; but this is not because the particular sequence of rolls that happened to occur is
more probable than a sequence of 6000 ones, but only because a far greater 
number of possible combinations lead to an approximately equal number of 
ones, twos, etc., rather than only to ones. 
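
The counting behind the dice example can be sketched numerically. The following Python snippet is only an illustration and not part of Boltzmann's text: it assumes a fair die and independent rolls, and the function name `log10_sequences_with_counts` is hypothetical. It compares the number of distinct sequences of 6000 rolls in which each face occurs exactly 1000 times with the single sequence consisting of ones only.

```python
import math

def log10_sequences_with_counts(counts):
    """Return log10 of the number of distinct roll sequences in which
    face i occurs exactly counts[i] times (a multinomial coefficient)."""
    n = sum(counts)
    # lgamma(k + 1) equals ln(k!), which avoids building huge integers.
    ln_value = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln_value / math.log(10)

# Assumption: a fair six-sided die rolled 6000 times, all sequences equally probable.
balanced = log10_sequences_with_counts([1000] * 6)
only_ones = log10_sequences_with_counts([6000, 0, 0, 0, 0, 0])
all_sequences = 6000 * math.log10(6)

print(f"log10(#sequences with exactly 1000 of each face) = {balanced:.1f}")
print(f"log10(#sequences consisting of ones only)        = {only_ones:.1f}")
print(f"log10(#all sequences of 6000 rolls)              = {all_sequences:.1f}")
```

Every individual sequence has the same probability; the nearly uniform outcome dominates only because vastly more sequences realize it.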

Hence, as has long been known, the calculus of probabilities also leads to 
the result that, from a mathematical point of view, a return to the original 
state is certainly not excluded and that it is even to be expected, assuming 
that the time of motion is sufficiently extended, since the probability of a 
state lying very close to the initial state is very small but not infinitely small. 
The consequence of Poincaré's theorem, which states that, apart from a few
singular state distributions, a state very close to the initial state must al- 
ways return after some, albeit very long, time has elapsed, is therefore fully 
consistent with my principles. 

Now, from this we must not conclude that the mechanical approach has to 
be modified in any way or even abandoned. This conclusion would be justified 
only if the mechanical approach had a consequence that runs contrary to 
experience. But this would be the case only if Mr. Zermelo were able to 
prove that the duration of this period within which the old state of the gas 
must occur in accordance with Poincaré's theorem has an observable length.
It should be a priori evident by now that if, say, a quintillion tiny spheres 
each of which is endowed with great velocity were assembled in one corner of 
the container with elastic walls at the beginning, they would be distributed 
fairly equally in the container after a short time, and that the period of time 
within which all of their collisions offset one another so that they all return







sammenkommen, so gross sein muss, dass sie niemand zu erleben im Stande 
ist. Zum Ueberfiusse ergiebt die im Anhange beigefiigte Rechnung fiir diese 
Zeit einen Betrag, dessen enorme Grosse wahrhaft beruhigend ist. So wenig 
nun die im Anhange gegebene Rechnung irgend einen Anspruch auf Genauig- 
keit machen kann, so zeigt dieselbe doch, dass aus dem Poincare ' schen Satze 
jedenfalls nicht bewiesen werden kann, dass die theoretische Existenz einer 
Periode, nach welcher derselbe Zustand des Gases wiederkehrt, irgend einen 
Widerspruch mit der Erfahrung involvire, da die Lange dieser Periode jeder 
Beobachtbarkeit spottet. Die Zustande, die wir beobachten, aber fallen ja alle 
in die Zwischenzeit zwischen den Anfang und das Ende der Periode, wo der 
Poincare ' sche Satz Zustande, die sich im beliebigen Grade den Maxwell ' schen 
nahern, nicht ausschliesst. 

Der Zermelo ’ sche Fall ist daher nur einer jener vielen Falle (und zwar ein 
gegen die Gastheorie besonders wenig beweisender), wo ein theoretisch nur 
778 sehr unwahrscheinlicher | Zustand praktisch als niemals eintretend betrachtet 
werden muss. So miissen z.B. selbst bei gewohnlicher Temperatur im Knall- 
gase einzelne Molecule mit grosser Geschwindigkeit zu Zweien und selbst zu 
Dreien aufeinander stossen. Dasselbe muss sich also auch bei gewohnlicher 
Temperatur in Wasser verwandeln. 

Um ein anderes Beispiel zu geben, ist der Fall, dass in einem Gase wahrend 
einer Secunde kein Moleclil auf einen Stempel von bestimmter Grosse stosst, 
nur sehr unwahrscheinlich, nicht unmoglich. 

Die Zeit, wie lange man warten miisste, bis im Knallgase bei gewohnli- 
cher Temperatur eine messbare Wassermenge entsteht oder bis ein nicht allzu 
kleiner Stempel wahrend einer Secunde einen messbar kleinern Druck als den 
durchschnittlichen Gasdruck erfahrt, sind bei weitem nicht so lange, als die 
Zermelo'sche Periode, aber doch ausreichend lang, um jede Beobachtbarkeit 
auszuschliessen. Ein Argument gegen die Gastheorie konnte aus solchen Be- 
trachtungen nur claim abgeleitet werden, wenn clerartige Erscheinungen in 
Fallen ausblieben, wo sie nach der Rechnung in beobachtbaren Zeiten eintre- 
ten miissten. Dies scheint aber nicht der Fall zu sein, im Gegentheil: bei einer 
Temperatur, die tiefer als die allgemeine Umsetzungstemperatur ist, wurden 
wirklich Spuren chemischer Umsetzungen gefunden; ebenso wurden an ganz 
kleinen, in einem Gase befindlichen Korperchen Bewegungen wahrgegenom- 
men, welche davon herriihren konnen, dass in solchen Fallen in der That 
auf einem gegen ihre ganze Oberflache nicht mehr verschwindenden Theil 
derselben bald ein etwas grosserer, bald ein etwas kleinerer Druck wirkt. 

Wenn daher Hr. Zermelo aus der theoretischen Notwendigkeit, dass in ei- 
nem Gase der Anfangszustand wiederkehren muss, ohne zu berechnen, nach 
wie langer Zeit dies geschehen muss, den Schluss zieht, dass die Hypothesen 
der Gastheorie verlassen oder im Fundamente verandert werden miissen, so 
gleicht er einem Würfelspieler, welcher berechnet hat, dass die Wahrschein-
lichkeit 1000mal hintereinander ein Auge zu werfen nicht gleich Null ist und
nun schliesst, dass seine Wiirfel falsch sein miissen, weil ihm dieser Fall noch 
nie vorgekommen ist. 







to the same corner must be so great that nobody can live to see it happen. 
Moreover, the length of this time period, which is obtained by calculations 
in the appendix, is truly staggering in its enormity. With no pretense to 
exactitude, the appended calculations show that we cannot conclude from 
Poincaré's theorem that the theoretical existence of a period after which the
same state of the gas returns involves any contradiction with experience since 
the length of this period defies any observation. But the states we observe all 
fall in the interim between the beginning and the end of the period, in which 
Poincaré's theorem does not exclude states arbitrarily closely approximating
the Maxwell states. 
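
As a rough order-of-magnitude illustration of why such a return is never observed, one can estimate the probability of finding all spheres in one corner at an arbitrarily chosen instant. This is not the calculation of Boltzmann's appendix. It is a sketch that assumes, hypothetically, that each sphere is independently and uniformly distributed over the container and that the "corner" is one of eight equal octants. Because the probability underflows any floating-point format, only its base-10 logarithm is computed.

```python
import math

def log10_prob_all_in_region(n_particles, volume_fraction):
    """log10 of the probability that n independent, uniformly distributed
    particles all lie in a region occupying the given fraction of the volume."""
    return n_particles * math.log10(volume_fraction)

# Assumptions for illustration: 10**18 spheres (a quintillion),
# and a corner region equal to 1/8 of the container.
N = 10**18
log10_p = log10_prob_all_in_region(N, 1 / 8)
print(f"log10(probability) = {log10_p:.3e}")
```

Under these assumptions the exponent itself is of the order of the number of spheres, which is the sense in which the state is possible in principle but never encountered in practice.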

Zermelo's case is therefore but one among a multitude of those cases (and
one, at that, which does particularly little to refute the theory of gases) in 
which a state that is, theoretically speaking, highly improbable must, for 
all practical purposes, be taken never to occur at all. Thus, for instance, 
even at ordinary temperatures individual molecules in an oxyhydrogen gas 
must collide at great velocity in groups of two and even three. The gas must 
therefore turn into water even at ordinary temperatures. 

As another example, consider that the case where not a single gas molecule 
strikes a piston of a certain size within one second is only highly improbable 
but not impossible. 

The time it takes for a measurable amount of water to form in an oxy- 
hydrogen gas at ordinary temperatures or for a not excessively small piston 
to encounter a pressure measurably smaller than the average gas pressure 
within one second is not nearly as long as the Zermelo period, yet long
enough to exclude any possibility of observation. From such considerations 
we could derive an argument against the theory of gases only if phenomena 
of this sort failed to occur in cases where, according to the calculations, they 
would have to occur within measurable periods of time. But this does not 
seem to be the case here. On the contrary. Traces of chemical transforma- 
tions were actually found at a temperature lower than the general transfor- 
mation temperature; likewise, observations were made of movements of very 
small corpuscles, which may be due to the fact that in such cases a pressure, 
which is sometimes a little greater, sometimes a little smaller, really acts 
on a part of their surface that no longer vanishes compared to their entire 
surface. 

Hence, when Mr. Zermelo cites the theoretical necessity that, in a gas, 
the initial state must return, without calculating the time it takes for this 
to occur, in order to draw the conclusion that we must either abandon the 
hypotheses of the theory of gases or fundamentally change them, he is like a 
dice player who has calculated that the probability of rolling a one 1000 times 
in a row does not equal zero and now concludes that his dice must be loaded on
the ground that he has not yet come across this case. 
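
The dice player's predicament can be made quantitative with a short calculation. The sketch below assumes a fair die and independent rolls (assumptions the text does not spell out numerically) and works with logarithms, since the probability itself lies far below the smallest positive floating-point number.

```python
import math

ROLLS = 1000  # length of the run of ones considered in the text

# Probability of ROLLS consecutive ones with a fair die is (1/6)**ROLLS.
log10_p = -ROLLS * math.log10(6)

# The expected number of independent 1000-roll series needed to see the
# event once is 1/p, so its base-10 logarithm is simply -log10_p.
log10_expected_series = -log10_p

print(f"log10(probability of {ROLLS} ones in a row) = {log10_p:.2f}")
print(f"log10(expected number of series needed)    = {log10_expected_series:.2f}")
```

The probability is nonzero yet so small that never having seen the event says nothing about the dice, which is the point of the comparison.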







779 | Mit dem Vorgebrachten hangt nach meinen Ausfiihrungen an den Ein- 
gangs citirten Stellen der 2. Hauptsatz aufs Innigste zusammen. Auch er ist 
nach den molecular-theoretischen Anschauungen lediglich ein Wahrschein- 
lichkeitssatz. Nach diesen Anschauungen kann nicht aus den Bewegungsglei- 
chungen bewiesen werden, dass sich alle Erscheinungen immer in einem be- 
stimmten Sinne abspielen miissten. Bei alien Erscheinungen, wo nur sichtbare 
Bewegungen vorkommen, wo sich also die Korper bloss als Ganzes bewegen, 
muss jeder Bewegungssinn gleichberechtigt sein. Wo dagegen die Bewegung 
auf eine sehr grosse Anzahl sehr kleiner Moleciile iibergeht, diirfen wir, abge- 
sehen von verschwindend wenigen Fallen, die um so weniger zur Beobachtung 
gelangen konnen, je mehr Moleciile in’s Spiel kommen, den Uebergang von 
einem unwahrscheinlichen zu einem wahrscheinlicheren Zustande, also im- 
merwahrende Veranderungen in einem bestimmten Sinne erwarten, wie in 
einem Gase den Eintritt der Maxwell'schen Zustandsvertheilung. Wenn da-
gegen die Bewegungen einzelner Molecüle in Frage kämen, wäre dies nicht
mehr zu erwarten. 

Der erste und zweite Fall bestatigen sich in der Erfahrung: der dritte Fall 
wurde noch niemals realisirt. Seine Moglichkeit ist daher nicht bewiesen, aber 
auch nicht widerlegt. Namhafte Forscher, z. B. Helmholtz 1 , glaubten an die- 
selbe und wie ich in meinem Buche liber Gastheorie nachzuweisen suchte 2 , 
wird die Ansicht, dass der zweite Hauptsatz ein blosser Wahrscheinlichkeits- 
satz sei, durch die Thatsachen nicht nur nicht widerlegt, sondern dieselben 
schliessen sich dieser Ansicht sogar besonders gut an. Auch Gibbs 3 gelangt 
aus rein empirischen Thatsachen zu folgendem Schlusse: „The impossibility 
of an incompensated decrease of entropy seems to be reduced to an impro- 
bability." 

Wir kommen also zu folgendem Resultate: Wenn man die Warme als 
eine Bewegung von Moleciilen auffasst, welche gemass den allgemeinen Glei- 

780 chungen der Mechanik stattfindet | und annimmt, dass sich der Complex von 
Korpern, den wir wahrnehmen, jetzt gerade in einem sehr unwahrscheinlichen 
Zustande befindet, so ergiebt sich ein Satz, welcher fur alle bisher beobach- 
teten Erscheinungen mit dem zweiten Hauptsatze iibereinstimmt. 

Freilich sobald man Korper von so kleinen Dimensionen beobachtet, dass 
dieselben nur mehr wenige Molecule enthalten, muss die Giiltigkeit dieses 
Satzes aufhoren. Da aber iiber das Verhalten so kleiner Korper keinerlei Ver- 
suche vorliegen, so widerspricht diese Annahme keiner bisherigen Erfahrung; 
ja, gewisse mit sehr kleinen, in Gasen befindlichen Korpern vorgenommene 
Versuche sprechen eher zu ihren Gunsten, wenn man auch noch weit davon 
entfernt ist, von einem experimentellen Beweise ihrer Richtigkeit sprechen
zu konnen. 



1 Berl. Ber. 17 . p. 172. 1884; ebend. p. 34. Febr. 1882. 

2 1. c. p. 61. 

3 Gibbs, Conn. acad. trans. 3 . p. 229. 1875; Ostwald's deutsche Ausgabe p. 198. 







According to considerations of mine to which I referred at the beginning, 
the foregoing remarks are intimately connected with the second law. On the 
molecular-theoretic approach, it, too, is but a principle of probability theory. 
According to this approach, it is impossible to prove on the basis of the 
equations of motion that all phenomena must always unfold in one direction. 
For all phenomena that only involve visible motions, and hence where the 
bodies only move as a whole, the directions of motion must all be equivalent. 
In contrast, when the motion is transferred to a very large number of very 
small molecules, we may expect the transition from an improbable state to a 
more probable state to take place, and hence perennial changes in a certain 
direction, such as the onset of the Maxwell state distribution in a gas, with 
the exception of vanishingly few cases, which become ever less observable 
as the number of molecules involved grows. On the other hand, this would 
no longer have to be expected if the motions of individual molecules were 
relevant. 

The first and the second case are confirmed by experience: the third case 
has not yet been realized. Hence, its possibility has not been proved. However, 
it has not been refuted either. Notable scientists such as Helmholtz 5 believed 
in it. Moreover, not only do the facts fail to refute the view that the second 
law is but a principle of probability theory, but they even line up with it 
particularly well, as I have tried to show in my book on the theory of gases. 6
Gibbs, 7 too, infers from purely empirical facts the following conclusion: “The
impossibility of an incompensated decrease of entropy seems to be reduced 
to an improbability.” 

We thus reach the following result: If we consider heat as a motion of 
molecules that occurs in accordance with the general equations of mechanics 
and assume that the complex of bodies that we perceive is currently in a 
highly improbable state, then a theorem follows that is in agreement with 
the second law for all phenomena so far observed. 

Of course, this theorem can no longer hold once we observe bodies of 
so small a scale that they only contain a few molecules. Since, however, we 
do not have at hand any experimental results on the behavior of bodies so 
small, this assumption does not run counter to previous experience. In fact, 
certain experiments conducted on very small bodies in gases seem rather to 
support the assumption, although we are still far from being able to assert 
its correctness on the basis of experimental proof. 



5 von Helmholtz 1884, p. 172; von Helmholtz 1882, p. 34.

6 l. c. [i.e., Boltzmann 1896a], p. 61.

7 Gibbs 1875, p. 229; Gibbs 1892, p. 198. 







Auch wenn die in Frage kommenden Korper sehr viele Molecule enthalten, 
miissen noch immer enorm kleine Abweichungen von diesem Satze eintreten, 
da die Zahl der Molecule nicht unendlich ist. Allein diese Abweichungen konn- 
ten nur in so langen Zeitraumen sich bis zu einem beobachtbaren Werthe sum- 
miren, dass auch diese Consequenz der Atomistik nicht durch die Erfahrung 
widerlegt wird. Dies gilt urn so mehr, da ja die Gastheorie nur beansprucht, 
ein angenahertes Bild der Wirklichkeit zu sein. Storungen, welche die Mo- 
lecularbewegung durch den Lichtather, durch electrische Eigenschaften der 
Molecule etc. erfahrt, muss sie wegen unserer volligen Unbekanntschaft mit 
der Natur dieser Agentien vernachlassigen, absolut glatte Wande kommen 
niemals vor, vielmehr steht jedes Gas mit dem ganzen Universum in Wech- 
selwirkung und die Zulassigkeit der Gastheorie im grossen und ganzen wird 
daher durch kleine Abweichungen von der Erfahrung nicht widerlegt. 

Eine Antwort auf die Frage, woher es komme, dass sich gegenwartig die 
uns umgebenden Korper gerade in einem sehr unwahrscheinlichen Zustan- 
de befinden, kann man natiirlich von der Naturwissenschaft ebenso wenig 
erwarten, wie etwa auf die Frage, woher es komme, dass es iiberhaupt Er- 
scheinungen gibt und dass sich dieselben nach gewissen gegebenen Gesetzen 
abspielen. 

781 | Die Gastheorie ist nicht zu verwechseln mit der Kraftcentratheorie, d. h. 

mit der Hypothese, dass sich alle Naturerscheinungen durch Centralkrafte 
zwischen materiellen Punkten erklaren lassen, da die Gastheorie weder die 
Voraussetzung macht, dass sich das Verhalten des Lichtathers, noch dass sich 
die innere Beschaffenheit der Molecule durch Kraftcentra erklaren lasst, son- 
dern bloss, dass fiir die Wechselwirkung zweier Molecule wahrend der Zusam- 
menstosse mit einer fiir die Erklarung der Warmeerscheinungen geniigenden 
Annaherung die Lagrange ’schen Bewegungsgleichungen gelten. 

Gegen diese letztere Kraftcentratheorie konnte noch eine Consequenz des 
Poincare ’’ schen Satzes beziiglich des Verhaltens des ganzen Universums ins 
Feld gefiihrt werden. Man konnte sagen, dass nach dem Poincare ’ schen Satze 
auch das ganze Universum nach geniigend langer Zeit in seinen Anfangs- 
zustand zuriickkehren miisste und daher Zeiten kommen miissten, wo sich 
alle Vorgänge im entgegengesetzten Sinne wie jetzt abspielen. Allein der-
artige Schlüsse scheinen mir jeder Berechtigung zu entbehren. Wie sollen
wir, sobald wir die Sphare des Beobachtbaren verlassen, entscheiden, ob die 
Existenzdauer des Universums oder die Anzahl der Kraftcentra, welche es 
enthalt, unendlich gross hoherer Ordnung ist? Auch wird dann die Annah- 
me, dass der Bewegungsraum und der gesammte Energieinhalt endlich sind, 
fraglich. Es fiihrt ja auch die Annahme der unbedingten Giiltigkeit des Irre- 
versibilitatsprincips bei Anwendung auf das Universum unter Voraussetzung 
einer unendlich langen Dauer desselben bekanntlich zu der kaum mehr ver- 
lockenden Consequenz, dass, wenn sich alle irreversiblen Processe abgespielt 
haben, das Universum noch unendlich lange Zeit ohne jedes Geschehen fort- 
existiren oder wegen Mangels an Geschehen allmahlich verschwinden muss. 




Rejoinder to the heat-theoretic considerations of Mr. E. Zermelo 



241 



Even if the bodies under consideration contain a great number of mol- 
ecules, enormously small deviations from the principle must still occur, since 
the number of molecules is not infinite. These deviations, however, could 
only add up to observable values in periods of time so long that experience 
fails to refute this consequence of the atomistic theory as well. This is all 
the more true as the theory of gases is only supposed to be an approximate 
picture of reality after all. It is bound to neglect disturbances of molecular 
motions brought about by the ether, or by the electrical properties of the 
molecules etc. owing to our complete ignorance of the nature of these agents. 
Perfectly smooth walls do not exist. Rather, every gas interacts with the 
entire universe, and, by and large, the legitimacy of the theory of gases is 
therefore not refuted by minor deviations from experience. 

Of course, we cannot expect natural science to answer the question as to 
why the bodies surrounding us currently exist in a highly improbable state, 
just as we cannot expect it to answer the question as to why there are any 
phenomena at all and why they adhere to certain given principles. 

The theory of gases is not to be confused with the theory of force centers, 
i.e., with the hypothesis that all phenomena in nature can be explained by 
means of central forces acting between material points, since the theory of 
gases assumes neither that the behavior of the ether can be explained by 
means of force centers nor that the inner constitution of molecules can be 
so explained. It only assumes that Lagrange's equations of motion hold for
the interaction of two molecules during the collisions with an approximation 
sufficient for the explanation of the thermal phenomena. 

We can also use a consequence of Poincaré's theorem as applied to the
behavior of the entire universe as an argument against the theory of force
centers. One could say that, according to Poincaré's theorem, the entire
universe, too, must return to its initial state after a sufficient amount of time
and that hence there will have to be times when all processes unfold in the 
direction opposite to the current direction. Yet such conclusions are, in my 
view, completely lacking in justification. Once we have left the sphere of 
the observable behind us, how are we to decide then whether the duration 
of the universe, or the number of force centers contained in it, is infinitely 
large of higher order? Moreover, the assumption that the space of motion 
and the entire energy content are finite then becomes questionable. Also, it 
is well-known that the assumption of the strict validity of the principle of 
irreversibility, when applied to the universe on the supposition of its infinite 
duration, leads to the hardly more appealing consequence that once all irre- 
versible processes have completely unfolded, the universe must continue to 
exist for an infinite amount of time without any events occurring in it or that 
it will gradually fade away for want of events. While it would be illegitimate 




242 



Boltzmann 1896 



So wenig es nun berechtigt ware, hieraus auf die Unrichtigkeit des Irrever- 
sibilitatsprincips Schliisse zu ziehen, so wenig beweist der gleiche Fall etwas 
gegen die Atomistik. 

Alle gegen die mechanische Naturanschauung erhobenen Einwande sind 
daher gegenstandslos und beruhen auf Irrthiimern. Wer aber die Schwierig- 
keiten, welche die klare Erfassung der gastheoretischen Satze bietet, nicht zu 
782 iiberwinden | vermag, der sollte in der That dem Rathe Hrn. Zermelo ’ s folgen 
und sich entschliessen, dieselbe ganz aufzugeben. 



Anhang. 



Wir setzen ein Gefäss von 1 ccm Rauminhalt voraus. Darin soll sich Luft
von gewöhnlicher Dichte, also rund eine Trillion (n) Molecüle befinden. Die
Geschwindigkeit eines jeden sei anfangs 500 m pro Secunde. Der mittlere Abstand
der Centra zweier Nachbarmolecüle ist also etwa $10^{-6}$ cm.

Wir construiren nun um den Mittelpunkt jedes Molecüls einen Würfel
von $10^{-7}$ cm Seitenlänge, welchen wir den Anfangsraum des betreffenden
Moleciils nennen. Wir zeichnen ferner das Geschwindigkeitsdiagramm, in- 
dem wir die Geschwindigkeit jedes Moleciils vom Coordinatenursprunge aus 
in Grosse und Richtung auftragen. Der Endpunkt dieser Geraden heisse der 
Geschwindigkeitspunkt des betreffenden Moleciils. Hierauf theilen wir den 
ganzen unendlichen Raum in lauter Wiirfel von 1 m Seitenlange, welche wir 
die Elementarwiirfel nennen. Denjenigen Elementarwiirfel, in welchem sich 
der Geschwindigkeitspunkt eines Moleciils zu Anfang der Zeit befindet, nen- 
nen wir den Anfangsraum seines Geschwindigkeitspunktes. 

Wir fragen nun zunächst, nach wie langer Zeit gemäss des Poincaré'schen
Satzes die Centra sowie die Geschwindigkeitspunkte aller Moleciile wieder 
gleichzeitig in die betreffenden Anfangsraume zuriickkehren miissen, wobei 
wir, wie man sieht, den Spielraum fiir das, was wir Riickkehr zu einem glei- 
chen Zustande nennen, gewiss nicht enge gezogen haben, da wir den Ge- 
schwindigkeitszustand eines Moleciils als den alten bezeichnen, wenn jede 
seiner Geschwindigkeitscomponenten zu einem Werthe zuriickgekehrt ist, der 
sich um nicht mehr als 1 m von seinem urspriinglichen Werthe unterscheidet. 

Wir nehmen an, dass jedes Molecül in der Secunde $4 \cdot 10^9$ Zusammenstösse
erfährt. Es erfolgen also im ganzen in der Secunde etwa $b = 2 \cdot 10^{27}$
Zusammenstosse im Gase. Bei jedem solchen Zusammenstosse werden im 
allgemeinen die Geschwindigkeitspunkte zweier Moleciile in andere Elemen- 
tarwiirfel versetzt. Nach dem Poincare’ schen Satze braucht der urspriingliche 
783 Zustand nicht friiher wiederzukehren, bis | die Geschwindigkeitspunkte alle 
moglichen (N) Combinationen von Elementarwiirfeln durchlaufen haben. 

Das erste Molecül kann alle Geschwindigkeiten von Null bis ($500 \cdot 10^9 = a$)
m/sec annehmen. Hat es die Geschwindigkeit $v_1$ m/sec, so kann das zweite
noch alle Geschwindigkeiten von Null bis $\sqrt{a^2 - v_1^2}$ m/sec annehmen etc.







to infer from this case that the principle of irreversibility is incorrect, it would 
be equally wrong to assume that it refutes the atomistic theory in any way 
whatsoever. 

All objections against the mechanical approach to nature are therefore 
unfounded and based on mistakes. Anyone unable to overcome the difficulties 
attendant on a clear understanding of the principles of the theory of gases 
really ought to heed Mr. Zermelo's advice and resolve to abandon the theory
altogether. 



Appendix. 

Consider a container of 1 cc capacity which holds air at ordinary density, and
hence contains about one quintillion (n) molecules. Let the velocity of each
molecule be 500 m per second at the beginning. The mean distance between the
centers of two neighboring molecules is thus about $10^{-6}$ cm.
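The spacing quoted above can be checked with a short calculation. The following is only a sketch: it assumes that each molecule occupies an equal cubic cell of the container, which is a simplification introduced here and not a statement from the text.

```python
# Mean spacing of molecules in the 1 cc container described above.
n = 10**18        # number of molecules ("one quintillion"), from the text
volume_cc = 1.0   # container volume in cubic centimetres, from the text

# Assumption of this sketch: every molecule sits in its own cubic cell
# of volume V/n, so the mean spacing is the edge length of that cell.
spacing_cm = (volume_cc / n) ** (1.0 / 3.0)

print(f"mean spacing is about {spacing_cm:.1e} cm")
```

Under this assumption the cell edge comes out at the order of magnitude stated in the text.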

We now imagine a cube of edge length $10^{-7}$ cm which is constructed
around the center of each molecule and to which we refer as the initial space 
of the molecule. Furthermore, we draw the velocity diagram by plotting the 
velocity for each molecule by magnitude and direction from the point of 
origin. The endpoint of this straight line is said to be the velocity point of 
the molecule under consideration. We then divide the entire infinite space 
into cubes of edge length 1 m, which are called the “elementary cubes”. The 
elementary cube containing the velocity point of a molecule at the beginning 
of the time interval is said to be the initial space of its velocity point. 

Now, we first ask how long, according to Poincaré's theorem, it takes for
both the centers and the velocity points of all molecules to simultaneously 
return to their respective initial spaces, where the scope of the term “return 
to the same state” is evidently not taken too narrowly, since we recognize a 
velocity state of a molecule as its old velocity state when each of its velocity 
components returns to a value that differs from the original value by not 
more than 1 m. 

Let us assume that every molecule experiences $4 \cdot 10^9$ collisions per second.
Hence, the total number of collisions in the gas per second is about $b = 2 \cdot 10^{27}$.
As a rule, at each collision, the velocity points of two molecules are displaced
into other elementary cubes. According to Poincaré's theorem, the initial
state need not recur before the velocity points have passed through all possible 
(N) combinations of elementary cubes. 
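The total collision rate b follows from the per-molecule rate once one notes that a collision involves two molecules. The sketch below assumes purely binary collisions and takes the per-molecule rate as $4 \cdot 10^9$ per second; the variable names are chosen here for illustration.

```python
n = 10**18                            # molecules in the container, from the text
collisions_per_molecule = 4 * 10**9   # collisions per molecule per second

# Assumption of this sketch: every collision is binary, so summing the
# per-molecule rate over all molecules counts each collision twice.
b = n * collisions_per_molecule // 2

print(b)
```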

The first molecule can assume any velocity from zero to ($500 \cdot 10^9 = a$)
m/s. If its velocity is $v_1$ m/s, then the second molecule can assume any
velocity from zero to $\sqrt{a^2 - v_1^2}$ m/s, etc.
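The bound a can be motivated by energy conservation. The following is a sketch under assumptions not spelled out in the text: all n molecules have the same mass m, and the total kinetic energy stays at its initial value, with every molecule starting at 500 m/s.

```latex
\[
\tfrac{1}{2} m v_1^2 \;\le\; \sum_{i=1}^{n} \tfrac{1}{2} m v_i^2
  \;=\; n \cdot \tfrac{1}{2} m \,(500)^2
\quad\Longrightarrow\quad
v_1 \;\le\; 500\sqrt{n} \;=\; 500 \cdot 10^{9} \;=\; a ,
\]
\[
v_1^2 + v_2^2 \;\le\; a^2
\quad\Longrightarrow\quad
v_2 \;\le\; \sqrt{a^2 - v_1^2} .
\]
```

With $n = 10^{18}$ one has $\sqrt{n} = 10^9$, which gives the value of a used above. Each further molecule is bounded in the same way by the energy not yet assigned.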







Die Anzahl aller möglichen Combinationen, aller Geschwindigkeitspunkte
in die verschiedenen Elementarwürfel ist also:

\[
N = (4\pi)^{n-1} \int_0^{a} v_1^2\,dv_1 \int_0^{\sqrt{a^2 - v_1^2}} v_2^2\,dv_2 \cdots \int_0^{\sqrt{a^2 - v_1^2 - \cdots - v_{n-2}^2}} v_{n-1}^2\,dv_{n-1}
  = \frac{\pi^{3(n-1)/2}\, a^{3(n-1)}}{2 \cdot 3 \cdot 4 \cdots [3(n-1)/2]},
\]

oder

\[
\frac{2\,(2\pi)^{(3n-4)/2}\, a^{3(n-1)}}{3 \cdot 5 \cdot 7 \cdots 3(n-1)},
\]

je nachdem n ungerade oder gerade ist. 

Da jede dieser Combinationen durchschnittlich nach $1/b$ Secunde wechselt,
so werden sie alle in $N/b$ Secunden durchlaufen sein. Nach dieser Zeit
also müssten alle Molecüle bis auf eines das erlangt haben, was wir ihren
ursprünglichen Geschwindigkeitszustand nannten. Dabei ist noch die
Geschwindigkeitsrichtung dieses letzten Molecüls gar nicht beschränkt, ebenso wenig
die Lage des Mittelpunktes irgend eines Moleciils. Damit aber der Zustand 
wieder der alte wiirde, miisste auch der Mittelpunkt jedes Moleciils wieder in 
seinen Anfangsraum zuriickkehren, also die obige Zahl noch mit einer zweiten 
von ahnlicher Grossenordnung multiplicirt werden. 

Wie gross aber schon die Zahl N/b ist, davon erhalt man einen Begriff, 
wenn man bedenkt, dass sie viele Trillionen Stellen hat. Wenn dagegen um
jeden mit dem besten Fernrohr sichtbaren Fixstern so viele Planeten, wie um 
die Sonne kreisten, wenn auf jedem dieser Planeten so viele Menschen wie 
auf der Erde waren und jeder dieser Menschen eine Trillion Jahre lebte, so 
hatte die Zahl der Secunden, welche alle zusammen erleben, noch lange nicht 
fiinfzig Stellen. 

784 | Waren hingegen die Gasmoleciile anfangs im Gefasse ziemlich gleichmas- 

sig vertheilt, hatten aber alle genau dieselbe Geschwindigkeit, so wiirde sich 
schon nach einhundertmilliontel Secunde sehr nahe die Maxwell ’ sche Ge- 
schwindigkeitsvertheilung hergestellt haben. Die Vergleichung dieser Zahlen 
zeigt einerseits, einen wie kleinen Bruchtheil der Zahl aller moglichen Zu- 
standsvertheilungen diejenigen bilden, welche von der Maxwell ’ schen erheb- 
lich abweichen, andererseits wie zweifellos solche Satze, welche theoretisch 
nur den Charakter von Wahrscheinlichkeitssatzen haben, praktisch mit Na- 
turgesetzen gleichbedeutend sind. 



Wien, den 20. Marz 1896. 







The number of all possible combinations, of all velocity points in the 
various elementary cubes is therefore: 



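\[
N = (4\pi)^{n-1} \int_0^{a} v_1^2\,dv_1 \int_0^{\sqrt{a^2 - v_1^2}} v_2^2\,dv_2 \cdots \int_0^{\sqrt{a^2 - v_1^2 - \cdots - v_{n-2}^2}} v_{n-1}^2\,dv_{n-1}
  = \frac{\pi^{3(n-1)/2}\, a^{3(n-1)}}{2 \cdot 3 \cdot 4 \cdots [3(n-1)/2]},
\]

or

\[
\frac{2\,(2\pi)^{(3n-4)/2}\, a^{3(n-1)}}{3 \cdot 5 \cdot 7 \cdots 3(n-1)},
\]

depending on whether n is odd or even.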
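The iterated integral is the volume of a ball of radius a in 3(n - 1) dimensions, so both closed forms can be written with the Gamma function as $\pi^{d/2} a^d / \Gamma(d/2 + 1)$, where $d = 3(n-1)$. The sketch below uses this standard formula, which is a reformulation introduced here and not the notation of the text, to estimate the number of digits of N. The function names are hypothetical.

```python
import math

def log10_ball_volume(dim: int, radius: float) -> float:
    """Return log10 of the volume of a dim-dimensional ball of given radius."""
    # V = pi^(dim/2) * radius^dim / Gamma(dim/2 + 1); work with logarithms
    # because the volume itself overflows any floating-point type.
    ln_v = ((dim / 2) * math.log(math.pi)
            + dim * math.log(radius)
            - math.lgamma(dim / 2 + 1))
    return ln_v / math.log(10)

def log10_N(n: int, a: float) -> float:
    """log10 of the number N of combinations for n molecules."""
    return log10_ball_volume(3 * (n - 1), a)

n = 10**18   # molecules, from the text
a = 500e9    # maximum speed in m/s, from the text

digits_N = log10_N(n, a)   # roughly the number of decimal digits of N
print(f"N has on the order of {digits_N:.1e} digits")
```

For n = 2 and n = 3 the function reproduces the even and odd closed forms, namely $4\pi a^3/3$ and $\pi^3 a^6/6$. Dividing by b changes the digit count only by about 27, which is negligible at this scale.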

Since each of these combinations is replaced by another one after $1/b$
seconds on average, all of them will have had their turn after $N/b$ seconds.
Hence, after this time, all molecules, except one, will have reached what we
called their initial velocity state. Here, the velocity direction of this last molecule
has not even been restricted yet; nor has the location of the center of any 
molecule. But for the state to turn into its former self the center of each 
molecule would have to return to its initial space as well. Hence, the number 
above would have to be multiplied by another one of similar magnitude. 

In order to get an idea of how great even the number N/b is, just consider 
that it has many quintillions of digits. On the other hand, if we assume 
that every fixed star visible through the best telescope is orbited by as many 
planets as the sun, and that there are as many people on each of these planets 
as there are on earth, and that each of them has a life span of one quintillion 
years, then the number of seconds of all their life spans combined is far from 
reaching 50 digits. 
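The comparison figure can be reproduced with integer arithmetic. The text gives only the life span, so the star count, planet count, and population below are hypothetical round values chosen for this sketch.

```python
# Hypothetical inputs; only the life span is taken from the text.
stars = 10**9                   # assumption: stars visible through a telescope
planets_per_star = 8            # assumption: modern planet count for the sun
people_per_planet = 8 * 10**9   # assumption: rough present-day population
lifespan_years = 10**18         # one quintillion years, from the text
seconds_per_year = 31_557_600   # Julian year of 365.25 days

total_seconds = (stars * planets_per_star * people_per_planet
                 * lifespan_years * seconds_per_year)

digits = len(str(total_seconds))
print(digits)
```

With these inputs the total stays below 50 digits, and raising the star count by several orders of magnitude would not change that. The digit count of N/b, by contrast, is itself a number of the order of quintillions.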

If, in contrast, the gas molecules were fairly equally distributed in the 
container at the beginning but all had exactly the same velocity, then the 
Maxwell velocity distribution would be almost completely established after 
merely one one-hundred-millionth part of one second. The comparison of 
these numbers suggests, on the one hand, that the number of state distribu- 
tions deviating from the Maxwell state distribution is but a small fraction of 
the number of all possible state distributions, and, on the other hand, that 
those laws which, theoretically speaking, only have the character of principles 
of probability theory are, without any doubt, practically equivalent to laws 
of nature. 



Vienna, March 20, 1896.




